Re: [HACKERS] WAL consistency check facility

2016-09-10 Thread Amit Kapila
On Sat, Sep 10, 2016 at 8:33 PM, Robert Haas  wrote:
> On Sat, Sep 10, 2016 at 3:19 AM, Michael Paquier
>  wrote:
>> On Fri, Sep 9, 2016 at 4:01 PM, Kuntal Ghosh  
>> wrote:
> - If WAL consistency check is enabled for a rmgrID, we always include
> the backup image in the WAL record.

>>>> What happens if wal_consistency has different settings on a standby
>>>> and its master? If for example it is set to 'all' on the standby, and
>>>> 'none' on the master, or vice-versa, how do things react? An update of
>>>> this parameter should be WAL-logged, no?

>>> It is possible to set wal_consistency to 'All' on the master and any
>>> other value on the standby. But the scenario you mentioned will cause
>>> an error on the standby, since it may not get the required backup
>>> image for the WAL consistency check. I think the user should be
>>> responsible for setting this value correctly. We can improve the error
>>> message to make the user aware of the situation.
>>
>> Let's be careful here. You should as well consider things from the
>> angle that some parameter updates are WAL-logged as well, like
>> wal_level with the WAL record XLOG_PARAMETER_CHANGE.
>
> It seems entirely unnecessary for the master and the standby to agree
> here.  I think what we need is two GUCs.  One of them, which affects
> only the master, controls whether the validation information is
> included in the WAL, and the other, which affects only the standby,
> controls whether validation is performed when the necessary information
> is present.
>

From a clarity perspective this option sounds good, but I am slightly
afraid that it might be inconvenient for users to set different values
for these two parameters.

>  Or maybe skip the second one and just decree that
> standbys will always validate if the necessary information is present.

Sounds like a better alternative and probably easier to configure for users.
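To make the comparison concrete, here is a sketch of how the simpler scheme might look, using the parameter name and values from the patch under discussion (hypothetical — this is not a released server setting):

```
# postgresql.conf on the master (hypothetical, per the patch being
# discussed): request that consistency-check info be included in WAL
wal_consistency = 'all'        # or 'none', or a list of rmgr IDs

# On the standby there would be nothing to set: it validates whenever
# a WAL record carries the necessary backup image.
```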


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL consistency check facility

2016-09-10 Thread Amit Kapila
On Sat, Sep 10, 2016 at 12:49 PM, Michael Paquier
 wrote:
> On Fri, Sep 9, 2016 at 4:01 PM, Kuntal Ghosh  
> wrote:
>
>>>> - In recovery tests (src/test/recovery/t), I've added wal_consistency
>>>> parameter in the existing scripts. This feature doesn't change the
>>>> expected output. If there is any inconsistency, it can be verified in
>>>> corresponding log file.
>>>
>>> I am afraid that just generating a WARNING message is going to be
>>> useless for the buildfarm. If we want to detect errors, we could for
>>> example have an additional GUC to trigger an ERROR or a FATAL, taking
>>> down the cluster, and allowing things to show in red on a platform.
>>>
>> Yes, we can include an additional GUC to trigger an ERROR for any 
>> inconsistency.
>
> I'd like to hear extra opinions about that, but IMO just having an
> ERROR would be fine for the first implementation. Once you've bumped
> into an ERROR, you are likely going to fix it first.
>

+1 for just an ERROR to detect the inconsistency.  Adding an additional
GUC just to raise the error level doesn't seem advisable.



-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] pg_sequence catalog

2016-09-10 Thread Amit Kapila
On Sun, Sep 11, 2016 at 12:39 AM, Andres Freund  wrote:
> On 2016-09-10 17:23:21 +0530, Amit Kapila wrote:
>> >
>>
>> I may be missing something here, but why would it contend on a lock,
>> as per locking scheme proposed by Alvaro, access to sequence object
>> will need a share lock on buffer page.
>
> To make checkpointing/bgwriter work correctly we need exclusive locks on
> pages while writing..., or some new lock level preventing the page from
> being written out, while "shared dirtying" locks are being held.
>

Right, and I think you have a very valid concern. But if we think that
storing multiple sequences on the same page is a reasonable approach,
then we can invent some locking mechanism, as you indicated, such that
two writes on the same page won't block each other, but will be blocked
by the bgwriter/checkpointer.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Write Ahead Logging for Hash Indexes

2016-09-10 Thread Amit Kapila
On Sun, Sep 11, 2016 at 4:10 AM, Mark Kirkwood
 wrote:
>
>
> performed several 10 hour runs on size 100 database using 32 and 64 clients.
> For the last run I rebuilt with assertions enabled. No hangs or assertion
> failures.
>

Thanks for verification.  Do you think we can do some read-only
workload benchmarking using this server?  If yes, then probably you
can use concurrent hash index patch [1] and cache the metapage patch
[2] (I think Mithun needs to rebase his patch) to do so.



[1] - 
https://www.postgresql.org/message-id/caa4ek1j6b8o4pcepqrxnyblvbftonmjeem+qn0jzx31-obx...@mail.gmail.com
[2] - 
https://www.postgresql.org/message-id/cad__ouhj29cebif_flge4t9vj_-cfxbwcxhjo+d_16txbem...@mail.gmail.com


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] sequence data type

2016-09-10 Thread Vitaly Burovoy
On 9/10/16, Jim Nasby  wrote:
> On 9/3/16 6:01 PM, Gavin Flower wrote:
>> I am curious as to the use cases for other possibilities.
>
> A hex or base64 type might be interesting, should those ever get created...
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX

Hex and base64 are not data types. They are just different textual
representations of byte sequences. Even for bigints, these
representations are produced after the number is written out as a byte
sequence.
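For illustration (Python used here only as a neutral notation; the value and byte width are arbitrary): both encodings operate on the byte sequence, not on the integer itself.

```python
import base64

n = 305419896  # 0x12345678, an arbitrary "bigint" value

# Step 1: the number is first written out as a byte sequence.
raw = n.to_bytes(8, byteorder="big")

# Step 2: hex and base64 are just two textual renderings of those bytes.
print(raw.hex())                       # 0000000012345678
print(base64.b64encode(raw).decode())  # AAAAABI0Vng=
```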

-- 
Best regards,
Vitaly Burovoy




Re: [HACKERS] Override compile time log levels of specific messages/modules

2016-09-10 Thread Jim Nasby

On 9/6/16 5:18 AM, Craig Ringer wrote:

> I think something automatic that we clearly define as unstable and not
> to be relied upon would be preferable. Plus we already have much of the
> infrastructure in elog.c as used by errcontext etc.


Actually, I wish this was a straight-up logging level feature, because I 
need it all the time when debugging complicated user-level code. 
Specifically, I wish there was a GUC that would alter 
(client|log)_min_messages upon entering a specific function, either for 
just that function or for anything that function subsequently called. 
Bonus points if you could also specify a regex that the message text had 
to match.


I realize that code-wise that's completely incompatible with what you're 
discussing here, but I think the same API could be used (maybe as a 
different GUC).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [HACKERS] sequence data type

2016-09-10 Thread Jim Nasby

On 9/3/16 6:01 PM, Gavin Flower wrote:

> I am curious as to the use cases for other possibilities.


A hex or base64 type might be interesting, should those ever get created...
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [HACKERS] [PATCH] Generic type subscription

2016-09-10 Thread Jim Nasby

On 9/9/16 6:29 AM, Dmitry Dolgov wrote:

> Regarding the previous conversations [1], [2], here is a patch (with some
> improvements from Nikita Glukhov) for the generic type subscription.


Awesome! Please make sure to add it to the Commit Fest app.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE

2016-09-10 Thread Jim Nasby

On 8/1/16 11:38 AM, Bruce Momjian wrote:

> I am hoping for a "novice" mode that issues warnings about possible
> bugs, e.g. unintentionally-correlated subselect, and this could be part
> of that.


Somewhat related; I've recently been wondering about a mode that 
disallows Const's in queries coming from specific roles. The idea there 
is to make it impossible for an application to pass a constant in, which 
would make it impossible for SQL injection to happen. With how magical 
modern frameworks/languages are, it's often impossible to enforce that 
at the application layer.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [HACKERS] [PATCH] Alter or rename enum value

2016-09-10 Thread Jim Nasby

On 9/8/16 4:55 AM, Emre Hasegeli wrote:

> The main problem that has been discussed before was the indexes.  One
> way to tackle it is to reindex all the tables after the operation.
> Currently we do that when the datatype of indexed columns changes.  So
> it should be possible, but very expensive.


Why not just disallow dropping a value that's still in use? That's 
certainly what I would prefer to happen by default...

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




[HACKERS] Merge Join with an Index Optimization

2016-09-10 Thread Michael Malis
Hi,

As I understand it, a merge join will currently read all tuples from both
subqueries (besides early termination). I believe it should be possible to
take advantages of the indexes on one or both of the tables being read from
to skip a large number of tuples that would currently be read. As an
example, let's say we are doing an equi merge join between two tables with the
following data:

  (3, 4, 5, 9, 10, 11)
  (0, 1, 2, 6, 7, 8)

Currently, even though no tuples would be returned from the merge join, all
of the data will be read from both tables. If there are indexes on both
tables, pg should be able to execute as follows. Get the tuple with value 3
from the first table and then look up the first value greater than 3 in the
second table using the index. In this case, that value would be 6. Since 3
!= 6, pg would then look up the lowest value greater than 6 in the first
table. This process repeats, and tuples are output whenever the values are
equal. This can be thought of as an alternating nested loop join, where the
positions in the indexes are maintained. In the case where only one of the
tables has an index, this can be thought of as a nested loop join where the
inner relation is the table with the index, and the position in that index
is maintained while iterating over the outer relation (the outer relation
would need to be sorted for this to work).
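The alternating strategy described above can be sketched with sorted lists standing in for the two indexed tables, where a binary search plays the role of the index probe (this is only an illustration of the idea, not PostgreSQL code, and it advances one tuple per match, i.e. it ignores duplicate join keys):

```python
from bisect import bisect_left

def skip_merge_join(a, b):
    """Equi-join two sorted relations, skipping over non-matching
    ranges with binary search instead of scanning every tuple."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            # "index probe": first position in a with a value >= b[j]
            i = bisect_left(a, b[j], i + 1)
        else:
            j = bisect_left(b, a[i], j + 1)
    return out

# The example from above: no matches, and only a handful of probes
# are needed rather than a full scan of both sides.
print(skip_merge_join([3, 4, 5, 9, 10, 11], [0, 1, 2, 6, 7, 8]))  # []
```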

I can't demonstrate this with benchmarks, but I imagine it would
dramatically speed up merge joins between two tables where the number of
tuples satisfying the join condition is small. I don't know the Postgres
internals well enough to know whether there is anything obvious that
would prevent this optimization.

Thanks,
Michael


Re: [HACKERS] feature request: explain "with details" option

2016-09-10 Thread Jim Nasby

On 9/8/16 11:35 PM, Tom Lane wrote:

>> This isn't simple because there are often *lots* of variants. You
>> don't just want to see the "top 10" candidate plans, because they're
>> probably a bunch of small variants on the same plan; the ones you'll
>> be interested in will probably be very different plans with very bad
>> relative estimates.
>
> The other big problem here is that the planner tries *very* hard to reject
> losing paths early; so it does not even form an explainable plan for a
> large fraction of the search space.  (And if it did, you'd be dead before
> you got your EXPLAIN result back.)


What I've wished for is the ability to see plans that were close in cost 
to the best case scenario, since that indicates that a slight change in 
statistics would push the planner in another direction (sometimes with 
disastrous results). Maybe allowing some number of plans to bubble up if 
they were within X percent of the winner wouldn't be that horrible.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [HACKERS] Let file_fdw access COPY FROM PROGRAM

2016-09-10 Thread Corey Huinker
V2 of this patch:

Changes:
* rebased to most recent master
* removed non-tap test that assumed the existence of Unix sed program
* added non-tap test that assumes the existence of perl
* switched from filename/program to filename/is_program to more closely
follow patterns in copy.c
* slight wording change in C comments


On Thu, Sep 8, 2016 at 6:59 PM, Craig Ringer 
wrote:

> On 9 Sep. 2016 03:45, "Corey Huinker"  wrote:
> >
> >
>
> > Stylistically, would a separate .pl file for the emitter be preferable
> > to something inline like
> >
> >> perl -e 'print "a\tb\tcc\t4\n"; print "b\tc\tdd\t5\n"'
>
> I'd be fine with that and a suitable comment. Just be careful with
> different platforms' shell escaping rules.
>
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index b471991..bf9753a 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -59,6 +59,7 @@ struct FileFdwOption
 static const struct FileFdwOption valid_options[] = {
/* File options */
{"filename", ForeignTableRelationId},
+   {"program", ForeignTableRelationId},
 
/* Format options */
/* oids option is not supported */
@@ -85,10 +86,11 @@ static const struct FileFdwOption valid_options[] = {
  */
 typedef struct FileFdwPlanState
 {
-   char   *filename;   /* file to read */
-   List   *options;    /* merged COPY options, excluding filename */
-   BlockNumber pages;  /* estimate of file's physical size */
-   double  ntuples;    /* estimate of number of rows in file */
+   char   *filename;   /* file/program to read */
+   bool    is_program; /* true if filename represents an OS command */
+   List   *options;    /* merged COPY options, excluding filename and program */
+   BlockNumber pages;  /* estimate of file or program output's physical size */
+   double  ntuples;    /* estimate of number of rows in file/program output */
 } FileFdwPlanState;
 
 /*
@@ -96,9 +98,10 @@ typedef struct FileFdwPlanState
  */
 typedef struct FileFdwExecutionState
 {
-   char   *filename;   /* file to read */
-   List   *options;    /* merged COPY options, excluding filename */
-   CopyState   cstate; /* state of reading file */
+   char   *filename;   /* file/program to read */
+   bool    is_program; /* true if filename represents an OS command */
+   List   *options;    /* merged COPY options, excluding filename and is_program */
+   CopyState   cstate; /* state of reading file or program */
 } FileFdwExecutionState;
 
 /*
@@ -139,7 +142,9 @@ static bool fileIsForeignScanParallelSafe(PlannerInfo *root, RelOptInfo *rel,
  */
 static bool is_valid_option(const char *option, Oid context);
 static void fileGetOptions(Oid foreigntableid,
-  char **filename, List **other_options);
+  char **filename,
+  bool *is_program,
+  List **other_options);
 static List *get_file_fdw_attribute_options(Oid relid);
 static bool check_selective_binary_conversion(RelOptInfo *baserel,
   Oid foreigntableid,
@@ -196,16 +201,17 @@ file_fdw_validator(PG_FUNCTION_ARGS)
 
/*
 * Only superusers are allowed to set options of a file_fdw foreign 
table.
-* This is because the filename is one of those options, and we don't 
want
-* non-superusers to be able to determine which file gets read.
+* The reason for this is to prevent non-superusers from changing the 
+* definition to access an arbitrary file not visible to that user
+* or to run programs not accessible to that user.
 *
 * Putting this sort of permissions check in a validator is a bit of a
 * crock, but there doesn't seem to be any other place that can enforce
 * the check more cleanly.
 *
-* Note that the valid_options[] array disallows setting filename at any
-* options level other than foreign table --- otherwise there'd still 
be a
-* security hole.
+* Note that the valid_options[] array disallows setting filename and
+* program at any options level other than foreign table --- otherwise
+* there'd still be a security hole.
 */
if (catalog == ForeignTableRelationId && !superuser())
ereport(ERROR,
@@ -247,11 +253,11 @@ file_fdw_validator(PG_FUNCTION_ARGS)
}
 
/*
-* Separate out filename and column-specific options, since

Re: [HACKERS] Write Ahead Logging for Hash Indexes

2016-09-10 Thread Mark Kirkwood



On 09/09/16 14:50, Mark Kirkwood wrote:


Yeah, good suggestion about replacing (essentially) all the indexes
with hash ones and testing. I did some short runs with this type of
schema yesterday (actually to get a feel for whether hash performance
was comparable to btree - it does seem to be), but longer runs with
higher concurrency (as high as I can manage on a single-socket i7
anyway) are probably a good plan. If Ashutosh has access to seriously
large numbers of cores then that is even better :-)





I managed to find a slightly bigger server (used our openstack cloud to 
run a 16 cpu vm). With the schema modified as follows:


bench=# \d pgbench_accounts
   Table "public.pgbench_accounts"
  Column  | Type  | Modifiers
--+---+---
 aid  | integer   | not null
 bid  | integer   |
 abalance | integer   |
 filler   | character(84) |
Indexes:
"pgbench_accounts_pkey" hash (aid)

bench=# \d pgbench_branches
   Table "public.pgbench_branches"
  Column  | Type  | Modifiers
--+---+---
 bid  | integer   | not null
 bbalance | integer   |
 filler   | character(88) |
Indexes:
"pgbench_branches_pkey" hash (bid)

bench=# \d pgbench_tellers
Table "public.pgbench_tellers"
  Column  | Type  | Modifiers
--+---+---
 tid  | integer   | not null
 bid  | integer   |
 tbalance | integer   |
 filler   | character(84) |
Indexes:
"pgbench_tellers_pkey" hash (tid)

bench=# \d pgbench_history
  Table "public.pgbench_history"
 Column |Type | Modifiers
+-+---
 tid| integer |
 bid| integer |
 aid| integer |
 delta  | integer |
 mtime  | timestamp without time zone |
 filler | character(22)   |
Indexes:
"pgbench_history_pkey" hash (bid)

performed several 10 hour runs on size 100 database using 32 and 64 
clients. For the last run I rebuilt with assertions enabled. No hangs or 
assertion failures.


regards

Mark




Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-09-10 Thread Tom Lane
Christian Convey  writes:
>If that's correct, then it sounds like the only way Joy's commit has
>a chance of acceptance is if Peter's commit is rejected.
>Because Peter's commit might be merged as part of the 2016-09
>commitfest, but Joy's can't show up until 2016-11 at the earliest.

No, we're not that rigid about it.  The commitfest mechanism is really
mostly meant to ensure that every submission does get looked at within a
reasonable amount of time and doesn't get forgotten about.  (We do tend
to be willing to tell new patches they've got to wait till the next fest,
but that's mainly when we're already feeling overloaded by what's in the
current queue.)  In this case it would be nonsensical to consider only
Peter's submission and not Joy's, because part of the issue is whether
one's better than the other.

> There seems to be a little ambiguity regarding the exact version of
> the code to be reviewed.  This is true for both Joy's and Peter's
> submissions:
>* Joy's email provides a link to a Github repo, but does not specify
>  a particular commit (or even branch) in that repo: [2]
>* In the email thread, Peter did provide a patch set: [3]
>  but the corresponding commitfest entry references a github branch: [4]

Well, at the point we're at here, it's probably not that important exactly
which version of a patchset you're looking at; I'd take the latest.

> Q4) Do we require that any submitted patches appear as attachments on
> the pgsql-hackers email list, or is a github URL good enough?

Actually, we *do* require that, mainly as a formality to show that the
author intended to submit it for inclusion in Postgres.  (We don't want
people coming back later and saying "hey, you ripped off this code from
my github repo without my permission".)  If we were seriously considering
adopting Joy's version then that would have to happen before anything got
merged.  But at this point it seems like we're in a very exploratory
phase and there's no need for legal niceties yet.

> Q5) (This question is more generic.)  I'm accustomed to using Github's
> pull-request system, where I can engage in dialog regarding specific
> lines of a patch.  I haven't noticed anything similar being used for
> PG code reviews, but perhaps I'm just looking in the wrong places.
> Are all PG code reviews basically just back-and-forth email
> conversations on the pgsql-hackers list?

Yup, that's how we roll ... it's ancient habit from before there was
such a thing as git, let alone github, but we've not seen fit to change.
It does have the advantage that it's about equally amenable to high-
level discussions and line-by-line issues.

regards, tom lane




Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-09-10 Thread Christian Convey
Hi Heikki,

Could I ask you a newbie-reviewer question about something I'm seeing
here?  https://commitfest.postgresql.org/10/776/

From some reading I've done (e.g., Stephen Frost's PGCon 2011 slides),
I got the impression that a successful patch would always have this
sequence of states in commitfest:
  1. patch-record created
  ...
  2. Needs Review
  ...
  3. Ready for Committer

But if I'm reading the patch's activity log correctly, it looks like
you marked the patch as "Ready for Committer" (2016-09-06 18:59:02)
without any record of it having been reviewed.

Was that intentional?

Thanks very much,
Christian

P.S. I'm asking because I was planning to review that patch.  But I
can't tell if any more review by a non-committer is still required by
the commitfest workflow.

Kind regards,
Christian

On Tue, Sep 6, 2016 at 3:15 PM, Christian Convey
 wrote:
> On Tue, Sep 6, 2016 at 3:12 PM, Tom Lane  wrote:
>>> (2) It seems like there are still a few big questions about this commit:
>>>- Is it wanted at the moment?  It didn't seem like there's a
>>>  consensus about whether or not this enhancement should be
>>>  merged, even if the patch is pretty minimal.
>>>- It seems like there are two competing patch
>>>  sets in play for this enhancement: Joy's and
>>>  Peter's.  Presumably at most one of them would
>>>  be merged.
>>
>> These are things that reviews should be helping to decide.  It's probably
>> a squishier topic than some patches, but if you're interested, feel free
>> to read code and weigh in.
>
> Thanks. It sounds like worst-case scenario, I perform an unneeded
> review.  I'll give it a shot.




Re: [HACKERS] High-CPU consumption on information_schema (only) query

2016-09-10 Thread Robins Tharakan
>
>
> Without having at least compared EXPLAIN outputs from the two boxes, you
> have no business jumping to that conclusion.
>
> If EXPLAIN does show different plans, my first instinct would be to wonder
> whether the pg_stats data is equally up-to-date on both boxes.
>
> regards, tom lane
>

Thanks. The EXPLAIN plans were different, but (I don't have them now and)
I didn't know system catalogs were so severely affected by outdated
statistics as well (which is why I was looking for any other reasons I
might be missing). I reckon an ANALYSE should solve this? ... I'll get
back if I have something else to offer.

-
robins


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-09-10 Thread Christian Convey
> Thanks. It sounds like worst-case scenario, I perform an unneeded
> review.  I'll give it a shot.

Hi guys,

Apologies for more boring process-related questions, but any pointers
would be greatly appreciated...

I'm a bit confused about how PG's code-review process is meant to
handle this C++ port.  My confusion may stem from the combination of
my inexperience with the process, and there being two competing patch
sets.

Here's some background:

* My intention was to review Joy's patch.

* On "commitfest.postgresql.org" (for 2016-09), the only C++
  -related patch I found was Peter's: [1]

* I wrongly assumed that the commitfest entry would be for
  Joy's patch, not Peter's, so I signed up as its reviewer.
  (That's fine - I don't mind reviewing both authors' patch
  sets.)

But here are my questions:

Q1) My understanding of PG's code-review process is that it's a pipeline:
Step 1. The discussion starts on the pgsql-hackers mailing list, where
the author posts a patch.  He/she may also post revised patches
based on the discussion.

Step 2. A subset of those discussions are modeled by new entries in
the commitfest website.

Step 3. A subset of those commitfest items get merged.

   If that's correct, then it sounds like the only way Joy's commit has
   a chance of acceptance is if Peter's commit is rejected.
   Because Peter's commit might be merged as part of the 2016-09
   commitfest, but Joy's can't show up until 2016-11 at the earliest.

   Is my understanding correct?

There seems to be a little ambiguity regarding the exact version of
the code to be reviewed.  This is true for both Joy's and Peter's
submissions:
   * Joy's email provides a link to a Github repo, but does not specify
 a particular commit (or even branch) in that repo: [2]

   * In the email thread, Peter did provide a patch set: [3]
 but the corresponding commitfest entry references a github branch: [4]

So I have a few questions here:

Q2) Are authors expected to submit an unambiguous patch to frame the
discussion?  (I.e,. a specific patch file, or a specific git commit
hash, as opposed to a github repo or a github branch.)

Q3) Are authors expected to submit a single patch/commit, or is it
acceptable / desirable for a large patch to be broken up as Peter has
done?

Q4) Do we require that any submitted patches appear as attachments on
the pgsql-hackers email list, or is a github URL good enough?

Q5) (This question is more generic.)  I'm accustomed to using Github's
pull-request system, where I can engage in dialog regarding specific
lines of a patch.  I haven't noticed anything similar being used for
PG code reviews, but perhaps I'm just looking in the wrong places.
Are all PG code reviews basically just back-and-forth email
conversations on the pgsql-hackers list?

Thanks,
Christian

[1] https://commitfest.postgresql.org/10/776/
[2] 
https://www.postgresql.org/message-id/CABgyVxDBd3EvRdo-Rd6eo8QPEqV8%3DShaU2SJfo16wfE0R-hXTA%40mail.gmail.com
[3] 
https://www.postgresql.org/message-id/bf9de63c-b669-4b8c-d33b-4a5ed11cd5d4%402ndquadrant.com
[4] https://github.com/petere/postgresql/tree/commitfest/c%2B%2B




Re: [HACKERS] High-CPU consumption on information_schema (only) query

2016-09-10 Thread Tom Lane
Robins Tharakan  writes:
> I completely agree. With 'irrelevant' I was only trying to imply that
> irrespective of the complexity of the query, a replicated box was seeing
> similar slowness whereas a Restored DB wasn't. It felt that the SQL itself
> isn't to blame here...

Without having at least compared EXPLAIN outputs from the two boxes, you
have no business jumping to that conclusion.

If EXPLAIN does show different plans, my first instinct would be to wonder
whether the pg_stats data is equally up-to-date on both boxes.

regards, tom lane




Re: [HACKERS] High-CPU consumption on information_schema (only) query

2016-09-10 Thread Robins Tharakan
On Fri, 9 Sep 2016 at 09:39 Andres Freund  wrote:

> On 2016-09-07 23:37:31 +, Robins Tharakan wrote:
> > If someone asks for it, I could provide SQL + EXPLAIN, but it feels
> > irrelevant here.
>
> Why is that? information_schema are normal sql queries, and some of them
> far from trivial.
>
> Andres
>
Hi Andres,

I completely agree. With 'irrelevant' I was only trying to imply that
irrespective of the complexity of the query, a replicated box was seeing
similar slowness whereas a Restored DB wasn't. It felt that the SQL itself
isn't to blame here...

In effect, I was trying to ask if I am forgetting / missing something very
obvious / important that could cause such an observation.

As others recommended, I am unable to have direct access to the production
(master / slave) instances and so GDB / stack trace options are out of
bounds at this time. I'll revert if I am able to do that.

-
thanks
robins





Re: [HACKERS] Re: [COMMITTERS] pgsql: Use LEFT JOINs in some system views in case referenced row doesn

2016-09-10 Thread Tom Lane
I wrote:
> I looked into this a little.  There are at least three things we could
> do here:
> 1. Decide that showing walsenders is a good thing.  I'm not really
> sure why it isn't -- for example, we could take the trouble to display
> the current replication command as the walsender's activity.
> 2. Suppress walsenders by the expedient of ignoring PGPROCs that have
> zero database ID.  This would also ignore other process types that aren't
> connected to a specific database.  This is pretty backwards-compatible
> because it's equivalent to what used to happen implicitly via the inner
> join in the view.
> 3. Suppress walsenders by adding some new field to PgBackendStatus that
> identifies them, and having pg_stat_get_activity() use that to filter.

BTW, I had been thinking we could implement #2 or #3 relatively painlessly
by adding filter logic inside pg_stat_get_activity().  However, I now
notice that that function is also used by the pg_stat_replication view,
which joins its output to pg_stat_get_wal_senders().  That means we can
*not* simply suppress walsenders in the function's output; if we want to
filter them away in pg_stat_activity, we have to change the view
definition somehow, and probably the function's API as well.  And that
means we can't change the behavior without an initdb, which is kind of
annoying to have to do after rc1.

The fact that the pg_stat_replication view does show walsender processes
seems like possibly a reasonable argument for not showing them in
pg_stat_activity.  But we'd have to do some rejiggering of the view
definition to make that happen.

Don't have a strong opinion which way it ought to work --- I could go with
either the "walsenders are a perfectly good activity" position or the
"they should appear only in pg_stat_replication" one.  But the latter will
take an initdb to do, and I'm inclined to the position that it's too late
for that in 9.6.  As I noted, it was already inconsistent for per-db
walsenders long before 9.6, without field complaints, so I'm having a hard
time seeing this as a critical bug.

regards, tom lane




Re: [HACKERS] feature request: explain "with details" option

2016-09-10 Thread Jeff Janes
On Thu, Sep 8, 2016 at 10:40 AM, Roger Pack  wrote:

> My apologies if this was already requested before...
>
> I think it would be fantastic if postgres had an "explain the explain"
> option:
> Today's explain tells us what loops and scans were used, and relative
> costs, etc.  It doesn't seem to tell *why* the planner elected to use
> what it did.
>
> For instance, in the case of a corrupted index, it doesn't say why
> it's not using that index, it just doesn't use it, causing some
> confusion to end users.  At least causing confusion to me.
>

I've never seen such a thing.  If an index is corrupt, it still gets used
like normal.  You just get wrong results, or crashes, depending on the
nature of the corruption.

Or in the case of where it iterates over an entire table (seq. scan)
> instead of using an index because the index range specified "is most
> of the table" (thus not helpful to use the index)



The planner just comes up with plans that use a seq scan, and plans that
use an index; and then compares the cost of them and finds that one cost is
lower than the other.  It never explicitly develops a specific notion of "I
won't use the index because I'm retrieving too much of the table".  So it
wouldn't be just a matter of reporting something that isn't currently
reported, it would first have to infer that thing in the first place, and
that would probably be very hard.



> ...The choice is
> appropriate.  The reasoning why is not explicitly mentioned.  Again
> causing possibility for some delay as you try to "decipher the mind"
> of the planner.  Sometimes tables (ex: tables after having been first
> propagated) need an "analyze" run on them, but it's not clear from an
> "explain" output that the analyze statistics are faulty.  Not even a
> hint.
>

You can get a hint, sometimes, by comparing the predicted rows and the
actual rows of an "explain (analyze)".  Making this more user friendly to
do would probably be best done by making an expert-system tool to look at
the currently-reported plans (https://explain.depesz.com kind of does this
already, particularly the row_x column) rather than trying to build
something into core.  It could be improved by detecting when the node being
misestimated is simple scan of a single table or index with a single
filter/condition, rather than a join or a complex filter/condition.


>
> So this is a feature request for an "EXPLAIN DETAILS" option or
> something, basically like today's explain but with more "rationale"
> included.  This could be immensely useful to many Postgres users.
>

Unfortunately it would also be immensely hard to implement.

What I would find useful is simply to have it report details about how the
cost of each node of the plan was arrived at, i.e. how many multiples of
each of the *_cost factors were summed to arrive at the total cost. That
would help me a lot in interpreting plans, and might also help someone like
Depesz a lot in improving his expert-system.

It would be far easier than what you are proposing, but would still be a
lot of work. It would probably also be challenged because of the accounting
overhead, which, while small, would be incurred on every single planner run.

Cheers,

Jeff


Re: [HACKERS] Tuplesort merge pre-reading

2016-09-10 Thread Peter Geoghegan
On Sat, Sep 10, 2016 at 12:04 AM, Heikki Linnakangas  wrote:
>> Did I misunderstand something? I'm applying the first patch of Peter's
>> series (cap number of tapes), then your first one (remove prefetch)
>> and second one (use larger read buffers).
>
>
> Yes. I have been testing without Peter's first patch, with just the two
> patches I posted. But it should work together (and does, I just tested) with
> that one as well.

You're going to need to rebase, since the root displace patch is based
on top of my patch 0002-*, not Heikki's alternative. But, that should
be very straightforward.

-- 
Peter Geoghegan




Re: [HACKERS] pg_sequence catalog

2016-09-10 Thread Andres Freund
On 2016-09-10 17:23:21 +0530, Amit Kapila wrote:
> On Thu, Sep 1, 2016 at 12:00 AM, Andres Freund  wrote:
> > On 2016-08-31 14:23:41 -0400, Tom Lane wrote:
> >> Andres Freund  writes:
> >> > On 2016-08-31 13:59:48 -0400, Tom Lane wrote:
> >> >> You are ignoring the performance costs associated with eating 100x more
> >> >> shared buffer space than necessary.
> >>
>> > I doubt that's measurable in any real-world scenario. You seldom have
>> > hundreds of thousands of sequences that you all select from at a high
>> > rate.
> >>
> >> If there are only a few sequences in the database, cross-sequence
> >> contention is not going to be a big issue anyway.
> >
> > Isn't that *precisely* when it's going to matter? If you have 5 active
> > tables & sequences where the latter previously used independent locks,
> > and they'd now be contending on a single lock.
> >
> 
> I may be missing something here, but why would it contend on a lock?
> As per the locking scheme proposed by Alvaro, access to a sequence
> object will need a share lock on the buffer page.

To make checkpointing/bgwriter work correctly we need exclusive locks on
pages while writing..., or some new lock level preventing the page from
being written out, while "shared dirtying" locks are being held.

Andres




Re: [HACKERS] GiST penalty functions [PoC]

2016-09-10 Thread Andrew Borodin
>Personally I wouldn't be very happy about an IEEE754 assumption.
Ok, so let's avoid the autoconf man. There is no real explanation of the
grounds for this assumption, just a reference to a paper by David
Goldberg (and there is no mention that IEEE754 is absolute
everywhere). BTW, maybe we can ask David Goldberg how to hack that
float properly? I do not see any math sense in this:

pack_float(float actualValue, int realm)
{
...
  int realmAjustment = *((int*)&actualValue)/4;
  float something = *((float*)&realmAjustment);
...
}
No wonder it's not in libs.
Neither do I see a way to implement it with ldexp and siblings.
>I did go to the trouble of testing Postgres on a VAX and we fixed the few
instances where it had such dependencies for a net gain.
Greg, could you please point out those places to see how exactly it
should be #ifdef'd?

Regards, Andrey Borodin.




Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2016-09-10 Thread Peter Geoghegan
On Fri, Sep 9, 2016 at 5:22 PM, Claudio Freire  wrote:
> Since it is true that doing so would make it impossible to keep the
> asserts about tupindex in tuplesort_heap_root_displace, I guess it
> depends on how useful those asserts are (ie: how likely it is that
> those conditions could be violated, and how damaging it could be if
> they were). If it is decided the refactor is desirable, I'd suggest
> making the common siftup producedure static inline, to allow
> tuplesort_heap_root_displace to inline and specialize it, since it
> will be called with checkIndex=False and that simplifies the resulting
> code considerably.

Right. I want to keep it as a separate function for all these reasons.
I also think that I'll end up further optimizing what I've called
tuplesort_heap_root_displace in the future, to adopt to clustered
input. I'm thinking of something like Timsort's "galloping mode". What
I've come up with here still needs 2 comparisons and a swap per call
for presorted input. There is still a missed opportunity for clustered
or (inverse) correlated input -- we can make merging opportunistically
skip ahead to determine that the root tape's 100th tuple (say) would
still fit in the root position of the merge minheap. So, immediately
return 100 tuples from the root's tape without bothering to compare
them to anything. Do a binary search to find the best candidate
minheap root before the 100th tuple if a guess of 100 doesn't work
out. Adapt to trends. Stuff like that.

-- 
Peter Geoghegan




Re: [HACKERS] Re: [COMMITTERS] pgsql: Use LEFT JOINs in some system views in case referenced row doesn

2016-09-10 Thread Tom Lane
I wrote:
> Fujii Masao  writes:
>> On Sat, Aug 20, 2016 at 6:13 AM, Tom Lane  wrote:
>>> Use LEFT JOINs in some system views in case referenced row doesn't exist.

>> This change causes pg_stat_activity to report the "bogus" information about
>> walsenders as follows.

> Hmm ... but if we want to exclude walsenders from that view, surely we
> should do so explicitly, not have it happen as a hidden side-effect
> of not being able to join them to pg_database.

I looked into this a little.  There are at least three things we could
do here:

1. Decide that showing walsenders is a good thing.  I'm not really
sure why it isn't -- for example, we could take the trouble to display
the current replication command as the walsender's activity.

2. Suppress walsenders by the expedient of ignoring PGPROCs that have
zero database ID.  This would also ignore other process types that aren't
connected to a specific database.  This is pretty backwards-compatible
because it's equivalent to what used to happen implicitly via the inner
join in the view.

3. Suppress walsenders by adding some new field to PgBackendStatus that
identifies them, and having pg_stat_get_activity() use that to filter.


It looks to me like even before the change Fujii-san complains of, it was
only generic walsenders that were hidden in the view.  Database-specific
walsenders (am_db_walsender) would have a valid database OID in
PgBackendStatus and therefore would pass the join.  This makes me think
that option #1 is really the sanest alternative.  While option #2 is
the most backwards-compatible, it's pretty hard to see why we would think
that showing only database-specific walsenders is a desirable behavior.
If we want to exclude walsenders altogether, we need option #3.

Thoughts?

regards, tom lane




Re: [HACKERS] Useless dependency assumption libxml2 -> libxslt in MSVC scripts

2016-09-10 Thread Tom Lane
Michael Paquier  writes:
> Thanks. I was a bit too lazy to look at the history to get this piece
> of history out... And so the code that is presently in the MSVC
> scripts should have been updated at the same time as those
> compilations have been relaxed, but it got forgotten on the way I
> guess. Then it seems to me that the attached patch would do things as
> they should be done: removal of the condition for iconv, which is an
> optional flag for libxml2, and reversing the check for libxslt <->
> libxml2.

Actually, wait a minute.  Doesn't this bit in Mkvcbuild.pm need adjusted?

if ($solution->{options}->{xml})
{
	$contrib_extraincludes->{'pgxml'} = [
		$solution->{options}->{xml} . '/include',
		$solution->{options}->{xslt} . '/include',
		$solution->{options}->{iconv} . '/include' ];

	$contrib_extralibs->{'pgxml'} = [
		$solution->{options}->{xml} . '/lib/libxml2.lib',
		$solution->{options}->{xslt} . '/lib/libxslt.lib' ];
}

It might accidentally fail to fail as-is, but likely it would be better
not to be pushing garbage paths into extraincludes and extralibs when
some of those options aren't set.

regards, tom lane




Re: [HACKERS] Useless dependency assumption libxml2 -> libxslt in MSVC scripts

2016-09-10 Thread Tom Lane
Michael Paquier  writes:
> Thanks. I was a bit too lazy to look at the history to get this piece
> of history out... And so the code that is presently in the MSVC
> scripts should have been updated at the same time as those
> compilations have been relaxed, but it got forgotten on the way I
> guess. Then it seems to me that the attached patch would do things as
> they should be done: removal of the condition for iconv, which is an
> optional flag for libxml2, and reversing the check for libxslt <->
> libxml2.

Looks sane to me, will push.

regards, tom lane




Re: [HACKERS] COPY command with RLS bug

2016-09-10 Thread Adam Brightwell
> Looking for and improving test coverage for RLS is a good suggestion,
> but let's not link the fate of the issue reported here with this
> requirement. I have spent some time looking at this patch and this
> looks in rather good shape to me (you even remembered to use the
> prefix regress_* for the role name that you are adding!). So I have
> marked this bug fix as ready for committer.

Excellent, thanks for the review and feedback. I appreciate it.

-Adam




Re: [HACKERS] Allow to_date() and to_timestamp() to accept localized month names

2016-09-10 Thread Tom Lane
Mattia  writes:
> attached is a patch which adds support to localized month names in
> to_date() and to_timestamp() functions.

Seems like a fine goal.

> I thought about reusing from_char_seq_search() but localized month
> names use different capitalization according to the language grammar,
> so I used pg_strncasecmp to do the match.

pg_str(n)casecmp is really only meant to handle comparisons of ASCII
strings; it will definitely not succeed in case-folding multibyte
characters.  That's not a big problem for to_date's existing usages
but I'm afraid it will be for non-English month names.  I think you'll
need another solution there.  You might have to resort to what citext
does, namely apply the full lower() transformation, at least whenever
the data string actually contains MB characters.

> Regression tests with TM modifier are difficult since one should have
> the locale used for the test installed on his system.

I suspect you'll have to give up on putting much about this into the
standard regression tests.  We've used not-run-by-default test scripts
in some similar cases (eg collate.linux.utf8.sql), but personally I think
those are 99% a waste of time, precisely because they never actually
get run by anyone but the author.

regards, tom lane




Re: [HACKERS] WAL consistency check facility

2016-09-10 Thread Robert Haas
On Sat, Sep 10, 2016 at 3:19 AM, Michael Paquier
 wrote:
> On Fri, Sep 9, 2016 at 4:01 PM, Kuntal Ghosh  
> wrote:
 - If WAL consistency check is enabled for a rmgrID, we always include
 the backup image in the WAL record.
>>>
>>> What happens if wal_consistency has different settings on a standby
>>> and its master? If for example it is set to 'all' on the standby, and
>>> 'none' on the master, or vice-versa, how do things react? An update of
>>> this parameter should be WAL-logged, no?
>>>
>> It is possible to set wal_consistency to 'All' in master and any other
>> values in standby. But, the scenario you mentioned will cause error in
>> standby since it may not get the required backup image for wal
>> consistency check. I think that user should be responsible to set
>> this value correctly. We can improve the error message to make the
>> user aware of the situation.
>
> Let's be careful here. You should as well consider things from the
> angle that some parameter updates are WAL-logged as well, like
> wal_level with the WAL record XLOG_PARAMETER_CHANGE.

It seems entirely unnecessary for the master and the standby to agree
here.  I think what we need is two GUCs.  One of them, which affects
only the master, controls whether the validation information is
included in the WAL, and the other, which affects only the standby,
affects whether validation is performed when the necessary information
is present.  Or maybe skip the second one and just decree that
standbys will always validate if the necessary information is present.
Using the same GUC on both the master and the standby but making it
mean different things in each of those places (whether to log the
validation info in one case, whether to perform validation in the
other case) is another option that also avoids needing to enforce that
the setting is the same in both places, but probably an inferior one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] An extra error for client disconnection on Windows

2016-09-10 Thread Tom Lane
Michael Paquier  writes:
> On Fri, Sep 9, 2016 at 11:39 PM, Tom Lane  wrote:
>> So this change would deal nicely with the "peer application on the remote
>> host is suddenly stopped" case, at the price of being not nice about any
>> of the other cases.  Not convinced it's a good tradeoff.

> Yes, in the list of failure cases that could trigger this error, the
> one that looks like a problem is to me is when a network interface is
> disabled. It may be a good idea to let users know via the logs that
> something was connected. Could we for example log a WARNING message,
> and not report an error?

It isn't an "error".  These conditions get logged at COMMERROR which is
effectively LOG_SERVER_ONLY.

regards, tom lane




Re: [HACKERS] pg_sequence catalog

2016-09-10 Thread Amit Kapila
On Thu, Sep 1, 2016 at 12:00 AM, Andres Freund  wrote:
> On 2016-08-31 14:23:41 -0400, Tom Lane wrote:
>> Andres Freund  writes:
>> > On 2016-08-31 13:59:48 -0400, Tom Lane wrote:
>> >> You are ignoring the performance costs associated with eating 100x more
>> >> shared buffer space than necessary.
>>
> >> > I doubt that's measurable in any real-world scenario. You seldom have
> >> > hundreds of thousands of sequences that you all select from at a high
> >> > rate.
>>
>> If there are only a few sequences in the database, cross-sequence
>> contention is not going to be a big issue anyway.
>
> Isn't that *precisely* when it's going to matter? If you have 5 active
> tables & sequences where the latter previously used independent locks,
> and they'd now be contending on a single lock.
>

I may be missing something here, but why would it contend on a lock?
As per the locking scheme proposed by Alvaro, access to a sequence
object will need a share lock on the buffer page.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Re: [COMMITTERS] pgsql: Change the way that LWLocks for extensions are allocated.

2016-09-10 Thread Amit Kapila
On Tue, Aug 30, 2016 at 8:01 AM, Andres Freund  wrote:
> On 2016-08-30 07:57:19 +0530, Amit Kapila wrote:
>> I will write such a test case either in this week or early next week.
>
> Great.
>

Okay, attached patch adds some simple tests for pg_stat_statements.
One thing to note is that we can't add pg_stat_statements tests for
installcheck as this module requires shared_preload_libraries to be
set.  Hope this suffices.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


test_pg_stat_statements-v1.patch
Description: Binary data



Re: [HACKERS] improved DefElem list processing

2016-09-10 Thread Peter Eisentraut
On 8/22/16 10:28 AM, Alvaro Herrera wrote:
> Peter Eisentraut wrote:
> 
>> I'm not happy that utils/acl.h has prototypes for aclchk.c, because
>> acl.h is included all over the place.  Perhaps I should make a
>> src/include/catalog/aclchk.h to clean that up.
> 
> I've been bothered by that too in the past.  +1 for the cleanup.

Here is a patch for that.

It didn't quite achieve the elegance I was hoping for.  Most uses of
acl.h actually use aclchk.c functions, and the new aclchk.h must include
acl.h, so basically you end up just changing most includes of acl.h to
aclchk.h while still effectively including acl.h everywhere.  But I
think having one header file serve two separate .c files is still a
confusing pattern that is worth cleaning up.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index d4f9090..06041f0 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -38,6 +38,7 @@
 
 #include "access/htup_details.h"
 #include "access/reloptions.h"
+#include "catalog/aclchk.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
 #include "catalog/pg_foreign_server.h"
@@ -50,7 +51,6 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "parser/scansup.h"
-#include "utils/acl.h"
 #include "utils/builtins.h"
 #include "utils/fmgroids.h"
 #include "utils/guc.h"
diff --git a/contrib/pg_prewarm/pg_prewarm.c b/contrib/pg_prewarm/pg_prewarm.c
index c3a518c..d312d74 100644
--- a/contrib/pg_prewarm/pg_prewarm.c
+++ b/contrib/pg_prewarm/pg_prewarm.c
@@ -16,12 +16,12 @@
 #include 
 
 #include "access/heapam.h"
+#include "catalog/aclchk.h"
 #include "catalog/catalog.h"
 #include "fmgr.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/smgr.h"
-#include "utils/acl.h"
 #include "utils/builtins.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
diff --git a/contrib/pgrowlocks/pgrowlocks.c b/contrib/pgrowlocks/pgrowlocks.c
index e20e7f8..1c8fff0 100644
--- a/contrib/pgrowlocks/pgrowlocks.c
+++ b/contrib/pgrowlocks/pgrowlocks.c
@@ -27,12 +27,12 @@
 #include "access/multixact.h"
 #include "access/relscan.h"
 #include "access/xact.h"
+#include "catalog/aclchk.h"
 #include "catalog/namespace.h"
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/procarray.h"
-#include "utils/acl.h"
 #include "utils/builtins.h"
 #include "utils/rel.h"
 #include "utils/snapmgr.h"
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 1b45a4c..4a48e92 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -22,6 +22,7 @@
 #include "access/reloptions.h"
 #include "access/relscan.h"
 #include "access/xloginsert.h"
+#include "catalog/aclchk.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
 #include "miscadmin.h"
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index b56d0e3..a1b213e 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -20,10 +20,10 @@
 #include "postgres.h"
 
 #include "access/htup_details.h"
+#include "catalog/aclchk.h"
 #include "catalog/pg_type.h"
 #include "miscadmin.h"
 #include "parser/parse_type.h"
-#include "utils/acl.h"
 #include "utils/builtins.h"
 #include "utils/resowner_private.h"
 #include "utils/syscache.h"
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 6b709db..2305782 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -22,11 +22,11 @@
 #include "access/xloginsert.h"
 #include "access/xlog.h"
 #include "commands/vacuum.h"
+#include "catalog/aclchk.h"
 #include "catalog/pg_am.h"
 #include "miscadmin.h"
 #include "utils/memutils.h"
 #include "utils/rel.h"
-#include "utils/acl.h"
 #include "postmaster/autovacuum.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index 65c941d..c3a1997 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -21,11 +21,11 @@
 
 #include "access/relscan.h"
 #include "access/transam.h"
+#include "catalog/aclchk.h"
 #include "catalog/index.h"
 #include "lib/stringinfo.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
-#include "utils/acl.h"
 #include "utils/builtins.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index a585c3a..9face59 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -22,6 +22,7 @@
 #include "access/htup_details.h"
 #include "access/sysattr.h"
 #include "access/xact.h"
+#include "catalog/aclchk.h"
 #include "catalog/binary_upgrade.h"
 #include "catalog/catalog.h"
 #include "catalog/dependency.h"
@@ -59,7 +60,6 @@
 #include "nodes/makefuncs.h"
 #include 

Re: [HACKERS] pg_sequence catalog

2016-09-10 Thread Peter Eisentraut
On 9/5/16 10:35 PM, Tom Lane wrote:
> In this viewpoint, we'd keep the sequence-specific data in a pg_sequence
> catalog.  pg_sequence rows would be extensions of the associated pg_class
> rows in much the same way that pg_index rows extend the pg_class entries
> for indexes.  We should supply a view pg_sequences that performs the
> implied join, and encourage users to select from that rather than directly
> from pg_sequence (compare pg_indexes view).

Let's start with that.  Here is a patch that adds a pg_sequences view in
the style of pg_tables, pg_indexes, etc.  This seems useful independent
of anything else, but would give us more freedom to change things around
behind the scenes.

A slight naming wart: I added a function lastval(regclass) for internal
use to get a sequence's "last value".  But we already have a public
function lastval(), which gets the most recent nextval() result of any
sequence.  Those are two quite different things.  I don't want to
abandon the term "last value" here, however, because that is what the
sequence relation uses internally, and also Oracle uses it in its system
views with the same semantics that I propose here.  We could use a more
verbose name like sequence_last_value(regclass), perhaps.

lastval has been kept separate from pg_sequence_parameters, because if
we were to go ahead with a new catalog layout later, then
pg_sequence_parameters would become obsolescent while we would possibly
still need a lastval function.

The column names of the new view have been deliberately tuned to use a
more conventional style than the information schema while avoiding what
I would consider some past naming mistakes.  (For example, I hate
"is_cycled", which reads like "sequence has wrapped around at least once
in the past").

Here are some similar views in other places:

https://docs.oracle.com/cd/B28359_01/server.111/b28320/statviews_2053.htm
https://www.ibm.com/support/knowledgecenter/en/SSEPGG_9.7.0/com.ibm.db2.luw.sql.ref.doc/doc/r0004203.html
https://msdn.microsoft.com/en-us/library/ff877934.aspx

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From f74e1cc1f6ee4a56abc9f46c413c0af5086e1e40 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 9 Sep 2016 12:00:00 -0400
Subject: [PATCH] Add pg_sequences view

Like pg_tables, pg_views, and others, this view contains information
about sequences in a way that is independent of the system catalog
layout but more comprehensive than the information schema.
---
 doc/src/sgml/catalogs.sgml   | 90 
 src/backend/catalog/system_views.sql | 16 ++
 src/backend/commands/sequence.c  | 45 ++--
 src/include/catalog/pg_proc.h|  4 +-
 src/include/commands/sequence.h  |  1 +
 src/test/regress/expected/rules.out  | 14 +
 src/test/regress/expected/sequence.out   | 16 ++
 src/test/regress/expected/sequence_1.out | 16 ++
 src/test/regress/sql/sequence.sql|  7 +++
 9 files changed, 205 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 322d8d6..1c440e3 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7379,6 +7379,11 @@ System Views
  
 
  
+  pg_sequences
+  sequences
+ 
+
+ 
   pg_settings
   parameter settings
  
@@ -9114,6 +9119,91 @@ pg_seclabels Columns
   
  
 
+ 
+  pg_sequences
+
+  
+   pg_sequences
+  
+
+  
+   The view pg_sequences provides access to
+   useful information about each sequence in the database.
+  
+
+  
+   pg_sequences Columns
+
+   
+
+ 
+  Name
+  Type
+  References
+  Description
+ 
+
+
+ 
+  schemaname
+  name
+  pg_namespace.nspname
+  Name of schema containing sequence
+ 
+ 
+  sequencename
+  name
+  pg_class.relname
+  Name of sequence
+ 
+ 
+  sequenceowner
+  name
+  pg_authid.rolname
+  Name of sequence's owner
+ 
+ 
+  start_value
+  bigint
+  
+  Start value of the sequence
+ 
+ 
+  min_value
+  bigint
+  
+  Minimum value of the sequence
+ 
+ 
+  max_value
+  bigint
+  
+  Maximum value of the sequence
+ 
+ 
+  increment_by
+  bigint
+  
+  Increment value of the sequence
+ 
+ 
+  cycle
+  boolean
+  
+  Whether the sequence cycles
+ 
+ 
+  cache_size
+  bigint
+  
+  Cache size of the sequence
+ 
+
+   
+  
+
+ 
+
  
   pg_settings
 
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index ada2142..99a9b41 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -158,6 +158,22 @@ CREATE VIEW pg_indexes AS
  LEFT JOIN pg_tablespace T ON (T.oid 

Re: [HACKERS] Block level parallel vacuum WIP

2016-09-10 Thread Pavan Deolasee
On Wed, Aug 24, 2016 at 3:31 AM, Michael Paquier 
wrote:

> On Tue, Aug 23, 2016 at 10:50 PM, Amit Kapila 
> wrote:
> > On Tue, Aug 23, 2016 at 6:11 PM, Michael Paquier
> >  wrote:
> >> On Tue, Aug 23, 2016 at 8:02 PM, Masahiko Sawada 
> wrote:
> >>> As for PoC, I implemented parallel vacuum so that each worker
> >>> processes both 1 and 2 phases for particular block range.
> >>> Suppose we vacuum 1000 blocks table with 4 workers, each worker
> >>> processes 250 consecutive blocks in phase 1 and then reclaims dead
> >>> tuples from heap and indexes (phase 2).
> >>
> >> So each worker is assigned a range of blocks, and processes them in
> >> parallel? This does not sound performance-wise. I recall Robert and
> >> Amit emails on the matter for sequential scan that this would suck
> >> performance out particularly for rotating disks.
> >>
> >
> > The implementation in the patch is the same as we had initially thought
> > of for sequential scan, but it turned out not to be a good way to do it
> > because it can lead to an inappropriate balance of work among workers.
> > Suppose one worker finishes its work early; it won't be able to do more.
>
> Ah, so it was the reason. Thanks for confirming my doubts on what is
> proposed.
> --
>

I believe Sawada-san has got enough feedback on the design to work out the
next steps. It seems natural that the vacuum workers are assigned a portion
of the heap to scan and collect dead tuples (similar to what the patch does)
and that the same workers be responsible for the second phase of the heap scan.

But as far as index scans are concerned, I agree with Tom that the best
strategy is to assign a different index to each worker process and let them
vacuum indexes in parallel. That way the work for each worker process is
clearly cut out and they don't contend for the same resources, which means
the first patch to allow multiple backends to wait for a cleanup buffer is
not required. Later we could extend it further so that multiple workers can
vacuum a single index by splitting the work on physical boundaries, but
even that will ensure a clear demarcation of work and hence no contention on
index blocks.

ISTM this will require further work and it probably makes sense to mark the
patch as "Returned with feedback" for now.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions)

2016-09-10 Thread Craig Ringer
On 3 Sep. 2016 9:22 pm, "Michael Paquier"  wrote:
>
> On Sat, Sep 3, 2016 at 12:42 AM, Magnus Hagander 
wrote:
> > On Fri, Sep 2, 2016 at 8:50 AM, Michael Paquier <
michael.paqu...@gmail.com>
> > wrote:
> >> On Fri, Sep 2, 2016 at 2:20 AM, Peter Eisentraut
> >>  wrote:
> >> > On 5/13/16 2:39 AM, Michael Paquier wrote:
> >> What do others think about that? I could implement that on top of 0002
> >> with some extra options. But to be honest that looks to be just some
> >> extra sugar for what is basically a bug fix... And I am feeling that
> >> providing such a switch to users would be a way for one to shoot
> >> himself badly, particularly for pg_receivexlog where a crash can cause
> >> segments to go missing.
> >>
> >
> > Well, why do we provide a --nosync option for initdb? Wouldn't the
argument
> > basically be the same?
>
> Yes, the good-for-testing-but-not-production argument.

We need it for TAP tests. More and more of them will use pg_basebackup,
and it'll start hurting test speeds badly.


Re: [HACKERS] Logical Replication WIP

2016-09-10 Thread Peter Eisentraut
Review of 0004-Make-libpqwalreceiver-reentrant.patch:

This looks like a good change.

typo: _PG_walreceirver_conn_init

For libpqrcv_create_slot(), slotname should be const char *.
Similarly, for slotname in libpqrcv_startstreaming*() and conninfo in
libpqrcv_connect(). (the latter two pre-existing)

The connection handle should record in libpqrcv_connect() whether a
connection is a logical or physical replication stream.  Then that
parameter doesn't have to be passed around later (or at least some
asserts could double-check it).

In libpqrcv_connect(), the new argument connname is actually just the
application name, for which in later patches the subscription name is
passed in.  Does this have a deeper meaning, or should we call the
argument appname to avoid introducing another term?

New function libpqrcv_create_slot(): Hardcoded cmd length (hmm, other
functions do that too); it should use a StringInfo.  ereport instead of
elog.  No newline at the end of the error message, since PQerrorMessage()
already supplies it.  Typo "could not crate".  Briefly document the return
value.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Allow to_date() and to_timestamp() to accept localized month names

2016-09-10 Thread Mattia
Hi,
attached is a patch which adds support to localized month names in
to_date() and to_timestamp() functions.

The patch is fairly simple but I want to discuss the approach and
implementation:

Using the TM modifier as in to_char() was already discussed some years
ago: 10710.1202170...@sss.pgh.pa.us [1]

I thought about reusing from_char_seq_search() but localized month
names use different capitalization according to the language grammar,
so I used pg_strncasecmp to do the match.

Regression tests with the TM modifier are difficult, since one would need
the locale used for the test installed on their system.

Usage example:
postgres=# set lc_time to 'fr_FR';
SET
postgres=# select to_date('22 janvier 2016', 'DD TMMonth YYYY');
  to_date
------------
 2016-01-22
(1 row)

[1] https://www.postgresql.org/message-id/10710.1202170898%40sss.pgh.pa.us

Thanks
Mattia


localized_month_names_v1.patch
Description: application/download



Re: [HACKERS] pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions)

2016-09-10 Thread Peter Eisentraut
On 9/3/16 9:23 AM, Michael Paquier wrote:
> On Sat, Sep 3, 2016 at 10:22 PM, Michael Paquier
>  wrote:
>> On Sat, Sep 3, 2016 at 10:21 PM, Michael Paquier
>>  wrote:
>>> Oh, well. I have just implemented it on top of the two other patches
>>> for pg_basebackup. For pg_receivexlog, I am wondering if it makes
>>> sense to have it. That would be trivial to implement, and I think
>>> that we had better make the combination of --synchronous and --nosync
>>> just exit with an error. Thoughts about having that for
>>> pg_receivexlog?
>>
>> With patches that's actually better..
> 
> Meh, meh et meh.

In fsync_pgdata(), you have removed the progname from one error
message, even though it is passed into the function.  Also, I think
fsync_pgdata() should not be printing initdb's progress messages.
That should stay in initdb.  Worse, in 0002 you are then changing the
output, presumably to suit pg_basebackup or something else, which
messes up the initdb output.

durable_rename() does not fsync an existing newfile before renaming,
in contrast to the backend code in fd.c.

I'm slightly concerned about lots of code duplication in
durable_rename, fsync_parent_path between backend and standalone code.
Also, I'm equally slightly concerned that having duplicate function
names will mess up tag search for someone.

This is a preexisting mistake, but fsync_fname_ext() should just be
fsync_fname() to match the backend naming.  In the backend,
fsync_fname_ext() "extends" fsync_fname() by accepting an ignore_perm
argument, but the initdb code doesn't do that.

I don't think tar file output in pg_basebackup needs to be fsynced.
It's just a backup file like what pg_dump produces, and we don't fsync
that either.  The point of this change is to leave a data directory in
a synced state equivalent to what initdb leaves behind.

Some of the changes in receivelog.c seem excessive.  Replacing a plain
fsync() with fsync_fname_ext() plus fsync_parent_path() isn't
justified by the references you have provided.  This would need more
explanation.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] PATCH: Exclude additional directories in pg_basebackup

2016-09-10 Thread Michael Paquier
On Fri, Sep 9, 2016 at 10:18 PM, David Steele  wrote:
> On 9/6/16 10:25 PM, Michael Paquier wrote:
>>
>>
>> On Wed, Sep 7, 2016 at 12:16 AM, David Steele  wrote:
>>
>>> Attached is a new patch that adds sgml documentation.  I can expand on
>>> each
>>> directory individually if you think that's necessary, but thought it was
>>> better to lump them into a few categories.
>>
>>
>> +be ommitted from the backup as they will be initialized on postmaster
>> +startup. If the  is set and
>> is
>> +under the database cluster directory then the contents of the
>> directory
>> +specified by  can also
>> be ommitted.
>>
>> s/ommitted/omitted/
>
>
> Fixed.
>
>> +#define EXCLUDE_DIR_MAX 8
>> +#define EXCLUDE_DIR_STAT_TMP 0
>> +
>> +const char *excludeDirContents[EXCLUDE_DIR_MAX] =
>> +{
>> +/*
>> + * Skip temporary statistics files. The first array position will be
>> + * filled with the value of pgstat_stat_directory relative to PGDATA.
>> + * PG_STAT_TMP_DIR must be skipped even when stats_temp_directory is
>> set
>> + * because PGSS_TEXT_FILE is always created there.
>> + */
>> +NULL,
>> I find that ugly. I'd rather use an array of unspecified size for the
>> fixed elements, terminated by NULL, remove EXCLUDE_DIR_MAX and
>> EXCLUDE_DIR_STAT_TMP, and use a small routine to do the work done in
>> _tarWriteHeader...
>
>
> Done.  Also writing out pg_xlog with the new routine.

Thanks, this looks in far better shape now.

+   /* Removed on startup, see DeleteAllExportedSnapshotFiles(). */
+   "pg_snapshots",
Using SNAPSHOT_EXPORT_DIR is possible here.

+_tarWriteDir(char *pathbuf, int basepathlen, struct stat *statbuf,
+bool sizeonly)
That's nice, thanks! This even takes care of the case where directories
other than pg_xlog are symlinks or junction points.

+   /* Recreated on startup, see StartupReplicationSlots(). */
+   "pg_replslot",
This comment is incorrect, pg_replslot is skipped in a base backup
because that's just useless.
-- 
Michael




Re: [HACKERS] An extra error for client disconnection on Windows

2016-09-10 Thread Michael Paquier
On Fri, Sep 9, 2016 at 11:39 PM, Tom Lane  wrote:
> Haribabu Kommi  writes:
>> On Thu, Jun 2, 2016 at 6:51 PM, Kyotaro HORIGUCHI <
>> horiguchi.kyot...@lab.ntt.co.jp> wrote:
>>> After a process termination without PQfinish() of a client,
>>> server emits the following log message not seen on Linux boxes.
>>> LOG:  could not receive data from client: An existing connection was
>>> forcibly closed by the remote host.
>>>
>>> This patch translates WSAECONNRESET of WSARecv to an EOF so that
>>> pgwin32_recv behaves the same way with Linux.
>
>> Marked the patch as "ready for committer".
>
> Windows is not my platform, but ... is this actually an improvement?
> I'm fairly concerned that this change would mask real errors that ought
> to get logged.  I don't know that that's an okay price to pay for
> suppressing a log message when clients violate the protocol.
>
> According to
> https://msdn.microsoft.com/en-us/library/windows/desktop/ms740668(v=vs.85).aspx
> WSAECONNRESET means:
>
> An existing connection was forcibly closed by the remote host. This
> normally results if the peer application on the remote host is
> suddenly stopped, the host is rebooted, the host or remote network
> interface is disabled, or the remote host uses a hard close (see
> setsockopt for more information on the SO_LINGER option on the remote
> socket). This error may also result if a connection was broken due to
> keep-alive activity detecting a failure while one or more operations
> are in progress. Operations that were in progress fail with
> WSAENETRESET. Subsequent operations fail with WSAECONNRESET.
>
> (The description of WSAENETRESET, on the same page, indicates that the
> last two sentences apply only to the keep-alive failure case.)
>
> So this change would deal nicely with the "peer application on the remote
> host is suddenly stopped" case, at the price of being not nice about any
> of the other cases.  Not convinced it's a good tradeoff.

Yes, in the list of failure cases that could trigger this error, the
one that looks like a problem to me is when a network interface is
disabled. It may be a good idea to let users know via the logs that
something went wrong with the connection. Could we, for example, log a
WARNING message and not report an error?
-- 
Michael




Re: [HACKERS] WAL consistency check facility

2016-09-10 Thread Michael Paquier
On Fri, Sep 9, 2016 at 4:01 PM, Kuntal Ghosh  wrote:
>>> - If WAL consistency check is enabled for a rmgrID, we always include
>>> the backup image in the WAL record.
>>
>> What happens if wal_consistency has different settings on a standby
>> and its master? If for example it is set to 'all' on the standby, and
>> 'none' on the master, or vice-versa, how do things react? An update of
>> this parameter should be WAL-logged, no?
>>
> It is possible to set wal_consistency to 'All' on the master and any other
> value on the standby. But the scenario you mentioned will cause an error on
> the standby, since it may not get the required backup image for the WAL
> consistency check. I think that the user should be responsible for setting
> this value correctly. We can improve the error message to make the
> user aware of the situation.

Let's be careful here. You should as well consider things from the
angle that some parameter updates are WAL-logged as well, like
wal_level with the WAL record XLOG_PARAMETER_CHANGE.

>>> - In recovery tests (src/test/recovery/t), I've added wal_consistency
>>> parameter in the existing scripts. This feature doesn't change the
>>> expected output. If there is any inconsistency, it can be verified in
>>> corresponding log file.
>>
>> I am afraid that just generating a WARNING message is going to be
>> useless for the buildfarm. If we want to detect errors, we could for
>> example have an additional GUC to trigger an ERROR or a FATAL, taking
>> down the cluster, and allowing things to show in red on a platform.
>>
> Yes, we can include an additional GUC to trigger an ERROR for any 
> inconsistency.

I'd like to hear extra opinions about that, but IMO just having an
ERROR would be fine for the first implementation. Once you've bumped
into an ERROR, you are likely going to fix it first.

>> +   /*
>> +* Followings are the rmids which can have backup blocks.
>> +* We'll enable this feature only for these rmids.
>> +*/
>> +   newwalconsistency[RM_HEAP2_ID] = true;
>> +   newwalconsistency[RM_HEAP_ID] = true;
>> +   newwalconsistency[RM_BTREE_ID] = true;
>> +   newwalconsistency[RM_HASH_ID] = true;
>> +   newwalconsistency[RM_GIN_ID] = true;
>> +   newwalconsistency[RM_GIST_ID] = true;
>> +   newwalconsistency[RM_SEQ_ID] = true;
>> +   newwalconsistency[RM_SPGIST_ID] = true;
>> +   newwalconsistency[RM_BRIN_ID] = true;
>> +   newwalconsistency[RM_GENERIC_ID] = true;
>> +   newwalconsistency[RM_XLOG_ID] = true;
>> Here you can just use MemSet with RM_MAX_ID and simplify this code 
>> maintenance.
>>
> Not all rmids can have backup blocks. So, for wal_consistency = 'all',
> I've enabled only those rmids which can have backup blocks.

Even if some rmgrs do not support FPWs, I don't think it is safe to
assume that the existing ones will never support them. Imagine, for
example, that feature X is implemented. Feature X adds rmgr Y, but rmgr
Y does not use FPWs. At a later point a new feature is added which
makes rmgr Y use FPWs. Your patch would increase the number of places to
update, increasing the likelihood of introducing bugs. It would
be better to use a safe implementation from the maintenance point of
view, to be honest (the maintenance load of the masking functions is
somewhat leveraged by the fact that the on-disk format is kept compatible).

>>
>> +   if (pg_strcasecmp(tok, "Heap2") == 0)
>> +   {
>> +   newwalconsistency[RM_HEAP2_ID] = true;
>> +   }
>> Thinking more about it, I guess that we had better change the
>> definition list of rmgrs in rmgr.h and get something closer to
>> RmgrDescData that pg_xlogdump has to avoid all this stanza by
>> completing it with the name of the rmgr. The only special cases that
>> this code path would need to take care of would be then 'none' and
>> 'all'. You could do this refactoring on top of the main patch to
>> simplify it as it is rather big (1.7k lines).
>>
> I'm not sure about this point. wal_consistency doesn't support  all
> the rmids. We should have some way to check this.

I'd rather see this code done in such a way that all the rmgrs can be
handled, this approach being particularly attractive for the fact that
there is no need to change it if new rmgrs are added in the future.
(This was also a reason why I still think that a simple on/off
switch would be plain enough; users mostly have control of the SQL
triggering WAL. And if you run tests, you'll likely think to turn
autovacuum off to avoid it generating FPWs and polluting the
logs, at least on the second run of your tests.)

And if you move forward with the approach of making this parameter a
list, I think that it would be better to add a section in the WAL
documentation about resource managers, like what they are, and list
them in this section of the docs. Then your parameter could link to
this documentation part and 

Re: [HACKERS] Tuplesort merge pre-reading

2016-09-10 Thread Heikki Linnakangas

On 09/10/2016 04:21 AM, Claudio Freire wrote:

On Fri, Sep 9, 2016 at 9:51 PM, Claudio Freire  wrote:

On Fri, Sep 9, 2016 at 8:13 AM, Heikki Linnakangas  wrote:


Claudio, if you could also repeat the tests you ran on Peter's patch set on
the other thread, with these patches, that'd be nice. These patches are
effectively a replacement for
0002-Use-tuplesort-batch-memory-for-randomAccess-sorts.patch. And review
would be much appreciated too, of course.

Attached are new versions. Compared to last set, they contain a few comment
fixes, and a change to the 2nd patch to not allocate tape buffers for tapes
that were completely unused.


Will do so


Thanks!


It seems both 1 and 1+2 break make check.


Oh. Works for me. What's the failure you're getting?


Did I misunderstand something? I'm applying the first patch of Peter's
series (cap number of tapes), then your first one (remove prefetch)
and second one (use larger read buffers).


Yes. I have been testing without Peter's first patch, with just the two 
patches I posted. But it should work together (and does, I just tested) 
with that one as well.


- Heikki





Re: [HACKERS] Useless dependency assumption libxml2 -> libxslt in MSVC scripts

2016-09-10 Thread Michael Paquier
On Thu, Sep 8, 2016 at 10:04 PM, Tom Lane  wrote:
> The core code has never used xslt at all.

Yes.

> Some quick digging in the git
> history suggests that contrib/xml2 wasn't very clean about this before
> 2008:
> [...]
> Both of those fixes postdate our native-Windows port, though I'm not sure
> of the origin of the specific test you're wondering about.

Thanks. I was a bit too lazy to dig through the git history to get this
piece of history out... And so the code that is presently in the MSVC
scripts should have been updated at the same time as those
compilation requirements were relaxed, but it got forgotten on the way I
guess. It then seems to me that the attached patch does things as
they should be done: removal of the condition for iconv, which is an
optional flag for libxml2, and reversal of the check for libxslt <->
libxml2.
-- 
Michael


msvc_xml_relax_v2.patch
Description: application/download



Re: [HACKERS] COPY command with RLS bug

2016-09-10 Thread Michael Paquier
On Sat, Sep 10, 2016 at 3:55 AM, Adam Brightwell
 wrote:
>> Perhaps we should extend rowsecurity test with a more comprehensive
>> set of tests rather than just fix the COPY one?
>
> I think more tests that provide value are always a *good* thing;
> however, I would argue that other tests 'unrelated' to this fix are more
> of a TODO item than something to include with this fix. Though, I am
> certainly willing to attempt to find/add more test cases around this
> specific functionality if that is desired.

Looking for and improving test coverage for RLS is a good suggestion,
but let's not link the fate of the issue reported here with this
requirement. I have spent some time looking at this patch and this
looks in rather good shape to me (you even remembered to use the
prefix regress_* for the role name that you are adding!). So I have
marked this bug fix as ready for committer.
-- 
Michael

