Re: [HACKERS] patch: function xmltable

2016-09-11 Thread Pavel Stehule
2016-09-12 6:36 GMT+02:00 Craig Ringer :

> On 12 September 2016 at 12:28, Craig Ringer  wrote:
> >> I'll take a closer read-through shortly.
>
> >DEFAULT
> > isn't a normal literal, it's an xpath expression evaluated at the same
> > time as the rowexpression.
>
> Sorry for the spam, but turns out that's not the case as implemented
> here. The docs you referenced say it should be an xpath expression,
> but the implementation here is of a literal value, and examples
> elsewhere on the Internet show a literal value. Unclear if the
> referenced docs are wrong or what and I don't have anything to test
> with.
>
> Feel free to fix/trim the DEFAULT related changes in above docs patch as
> needed.
>
> Also, tests/docs should probably cover what happens when PATH matches
> more than one element, i.e. produces a list of more than one match.
>

It is there for the case when this is allowed. When you change the target
type to any non-XML type, then an error is raised.

I didn't write negative test cases yet; I'll wait until the text of the
messages is final (or checked by a native speaker).

Regards

Pavel


>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>


Re: [HACKERS] patch: function xmltable

2016-09-11 Thread Pavel Stehule
2016-09-12 6:28 GMT+02:00 Craig Ringer :

> > I'll take a closer read-through shortly.
>
>
> Missing file. You omitted executor/tableexpr.h from the patch, so I
> can't compile.
>
> I've expanded and copy-edited the docs. Some of it is guesswork based
> on the references you sent and a glance at the code. Please check my
> changes carefully. I found a few surprises, like the fact that DEFAULT
> isn't a normal literal, it's an xpath expression evaluated at the same
> time as the rowexpression.
>
> Updated patch attached as XMLTABLE-v3 includes the docs changes. Note
> that it's missing tableexpr.h. For convenient review or to apply to
> your working tree I also attach a diff of just my docs changes as
> proposed-docs-changes.diff.
>
>
> Docs:
>
> - Can you send the sample data used to generate the example output?
> I'd like to include at least a cut down part of it in the docs to make
> it clear how the input correlates with output, and preferably put the
> whole thing in an appendix.
>

It is in the regression tests.


>
> - How does it decide what subset of the document to iterate over?
> That's presumably rowexpr, which is xpath in postgresql? (I added this
> to docs).
>
> - xmlnamespaces clause in docs needs an example for a non-default
> namespace.
>
> - What effect does xmlnamespaces clause have? Does supplying it allow
> you to reference qualified names in xpath? What happens if you don't
> specify it for a document that has namespaces or don't define all the
> namespaces? What if you reference an undefined namespace in xpath?
> What about if an undefined namespace isn't referenced by xpath, but is
> inside a node selected by an xpath expression?
>

All of this is under libxml2's control - when you use an undefined namespace,
libxml2 raises an error. The namespaces in the document and in the XPath
queries are completely independent - the relation between them is the URI.
When you use a wrong URI (referenced by name), the result will be an empty
set. When you use an undefined name, you will get an error.
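For illustration, a minimal hedged sketch of how the XMLNAMESPACES clause can
be used (syntax as proposed; it may still change, and the URI, prefix and
document below are made up):

  -- bind prefix "x" to the document's namespace URI and use it in the XPaths
  SELECT *
  FROM XMLTABLE(XMLNAMESPACES('http://example.com/ns' AS x),
                '/x:rows/x:row'
                PASSING '<rows xmlns="http://example.com/ns"><row><a>10</a></row></rows>'
                COLUMNS a int PATH 'x:a');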


>
> - What are the rules for converting the matched XML node into a value?
> If the matched node is not a simple text node or lacks a text node as
> its single child, what happens?
>

This process is described and controlled by the XML-to-SQL mapping. Postgres
has a minimalistic implementation without the possibility of external control
and without schema support. My implementation is simple: when the user doesn't
specify the result target explicitly (for example by using the text()
function), the text() function is applied implicitly whenever the target type
is not XML. I then dump the result to a string and run it through the input
function of the target type.
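A minimal sketch of the described mapping (assumed syntax; made-up data): the
matched node <a>42</a> is reduced to its text content, which is then fed
through the int type's input function.

  SELECT *
  FROM XMLTABLE('/rows/row'
                PASSING '<rows><row><a>42</a></row></rows>'
                COLUMNS a int PATH 'a');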



>
> - What happens if the matched has multiple text node children? This
> can happen if, for example, you have something like
>
> 
>   some text  other text
> 
>

It depends on the target type - it is allowed for XML, and it is disallowed
for other types. I thought about supporting arrays, but the patch would be
much more complex - there can be recursion - so I disallowed it. When the
user has to solve this issue, they can use nested XMLTABLE calls; the nested
call works with the XML type.

Just for the record - this issue is solved by the JSON_TABLE function, which
allows nested PATHs. But XMLTABLE doesn't allow it.
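A hedged sketch of the nested-XMLTABLE workaround mentioned above (made-up
document and column names): the outer call keeps each matched element as xml,
and the inner call expands the repeated children of that fragment into rows.

  SELECT inner_tab.*
  FROM XMLTABLE('/root/item'
                PASSING '<root><item><v>1</v><v>2</v></item></root>'
                COLUMNS item_xml xml PATH '.') AS outer_tab,
       XMLTABLE('/item/v'
                PASSING outer_tab.item_xml
                COLUMNS v int PATH '.') AS inner_tab;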


>
> - Is there a way to get an attribute as a value? If so, an example
> should show this because it's going to be a common need. Presumably
> you want   node/@attrname ?
>

You can use a reference to the current node "." - so "./@attname" should
work - an example is in the regression tests.
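A small sketch of attribute access (assumed syntax; made-up data): "./@id"
reads the id attribute of the current row element, while "a" reads a child
element.

  SELECT *
  FROM XMLTABLE('/rows/row'
                PASSING '<rows><row id="1"><a>x</a></row></rows>'
                COLUMNS id int  PATH './@id',
                        a  text PATH 'a');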



>
> - What happens if COLUMNS is not specified at all? It looks like it
> returns a single column result set with the matched entries as 'xml'
> type, so added to docs, please verify.
>

sure, that is it
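A sketch of the no-COLUMNS form (assumed behaviour as described above; made-up
data): each match of the row expression comes back as a single column of type
xml.

  SELECT *
  FROM XMLTABLE('/rows/row'
                PASSING '<rows><row><a>1</a></row><row><a>2</a></row></rows>');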


>
>
> - PASSING clause isn't really defined. You can specify one PASSING
> entry as a literal/colref/expression, and it's the argument xml
> document, right? The external docs you referred to say that PASSING
> may have a BY VALUE keyword, alias its argument with AS, and may have
> expressions, e.g.
>
> PASSING BY VALUE '' AS a, '' AS b
>
>   Neither aliases nor multiple entries are supported by the code or
> grammar. Should this be documented as a restriction? Do you know if
> that's an extension by the other implementation or if it's SQL/XML
> standard? (I've drafted a docs update to cover this in the updated
> patch).
>

The ANSI standard allows passing more documents - and then doing complex
queries with XQuery. Passing more than one document makes no sense in a
libxml2-based implementation, so I didn't support it. The referenced names can
be implemented later - but that needs changes in the XPATH function too.


>
>
> - What does BY REF mean? Should this just be mentioned with a "see
> xmlexists(...)" since it seems to be compatibility noise? Is there a
> corresponding BY VALUE or similar?
>

When the XML document is stored as a serialized DOM, then BY REF means a
link to this DOM. It has not 

Re: [HACKERS] patch: function xmltable

2016-09-11 Thread Pavel Stehule
2016-09-12 6:36 GMT+02:00 Craig Ringer :

> On 12 September 2016 at 12:28, Craig Ringer  wrote:
> >> I'll take a closer read-through shortly.
>
> >DEFAULT
> > isn't a normal literal, it's an xpath expression evaluated at the same
> > time as the rowexpression.
>
> Sorry for the spam, but turns out that's not the case as implemented
> here. The docs you referenced say it should be an xpath expression,
> but the implementation here is of a literal value, and examples
> elsewhere on the Internet show a literal value. Unclear if the
> referenced docs are wrong or what and I don't have anything to test
> with.
>

It is not spam. The previous comment was not correct. DEFAULT is an
expression - the result of this expression is used when data is missing.

In the standard, and in some other implementations, it is a literal only. It
is similar to the DEFAULT clause in a CREATE statement. Postgres allows an
expression there. Usually Postgres allows expressions everywhere where it
makes sense and where it is allowed by the bison parser.
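A hedged sketch of DEFAULT taking an expression (assumed syntax; made-up
data): column "b" has no matching node, so the DEFAULT expression - here a
concatenation rather than a bare literal - is evaluated and used.

  SELECT *
  FROM XMLTABLE('/rows/row'
                PASSING '<rows><row><a>1</a></row></rows>'
                COLUMNS a int  PATH 'a',
                        b text PATH 'b' DEFAULT 'missing' || ' value');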



>
> Feel free to fix/trim the DEFAULT related changes in above docs patch as
> needed.
>
> Also, tests/docs should probably cover what happens when PATH matches
> more than one element, i.e. produces a list of more than one match.
>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>


Re: [HACKERS] patch: function xmltable

2016-09-11 Thread Craig Ringer
On 12 September 2016 at 12:28, Craig Ringer  wrote:
>> I'll take a closer read-through shortly.

>DEFAULT
> isn't a normal literal, it's an xpath expression evaluated at the same
> time as the rowexpression.

Sorry for the spam, but turns out that's not the case as implemented
here. The docs you referenced say it should be an xpath expression,
but the implementation here is of a literal value, and examples
elsewhere on the Internet show a literal value. Unclear if the
referenced docs are wrong or what and I don't have anything to test
with.

Feel free to fix/trim the DEFAULT related changes in above docs patch as needed.

Also, tests/docs should probably cover what happens when PATH matches
more than one element, i.e. produces a list of more than one match.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] patch: function xmltable

2016-09-11 Thread Craig Ringer
> I'll take a closer read-through shortly.


Missing file. You omitted executor/tableexpr.h from the patch, so I
can't compile.

I've expanded and copy-edited the docs. Some of it is guesswork based
on the references you sent and a glance at the code. Please check my
changes carefully. I found a few surprises, like the fact that DEFAULT
isn't a normal literal, it's an xpath expression evaluated at the same
time as the rowexpression.

Updated patch attached as XMLTABLE-v3 includes the docs changes. Note
that it's missing tableexpr.h. For convenient review or to apply to
your working tree I also attach a diff of just my docs changes as
proposed-docs-changes.diff.


Docs:

- Can you send the sample data used to generate the example output?
I'd like to include at least a cut down part of it in the docs to make
it clear how the input correlates with output, and preferably put the
whole thing in an appendix.

- How does it decide what subset of the document to iterate over?
That's presumably rowexpr, which is xpath in postgresql? (I added this
to docs).

- xmlnamespaces clause in docs needs an example for a non-default namespace.

- What effect does xmlnamespaces clause have? Does supplying it allow
you to reference qualified names in xpath? What happens if you don't
specify it for a document that has namespaces or don't define all the
namespaces? What if you reference an undefined namespace in xpath?
What about if an undefined namespace isn't referenced by xpath, but is
inside a node selected by an xpath expression?

- What are the rules for converting the matched XML node into a value?
If the matched node is not a simple text node or lacks a text node as
its single child, what happens?

- What happens if the matched has multiple text node children? This
can happen if, for example, you have something like


  some text  other text


- Is there a way to get an attribute as a value? If so, an example
should show this because it's going to be a common need. Presumably
you want   node/@attrname ?

- What happens if COLUMNS is not specified at all? It looks like it
returns a single column result set with the matched entries as 'xml'
type, so added to docs, please verify.


- PASSING clause isn't really defined. You can specify one PASSING
entry as a literal/colref/expression, and it's the argument xml
document, right? The external docs you referred to say that PASSING
may have a BY VALUE keyword, alias its argument with AS, and may have
expressions, e.g.

PASSING BY VALUE '' AS a, '' AS b

  Neither aliases nor multiple entries are supported by the code or
grammar. Should this be documented as a restriction? Do you know if
that's an extension by the other implementation or if it's SQL/XML
standard? (I've drafted a docs update to cover this in the updated
patch).


- What does BY REF mean? Should this just be mentioned with a "see
xmlexists(...)" since it seems to be compatibility noise? Is there a
corresponding BY VALUE or similar?


- The parser definitions re-use xmlexists_argument . I don't mind
that, but it's worth noting here in case others do.


- Why do the expression arguments take c_expr (anything allowed in
a_expr or b_expr), not b_expr (restricted expression) ?


- Column definitions are underdocumented. The grammar says they can be
NOT NULL, for example, but I don't see that in any of the references
you mailed me nor in the docs. What behaviour is expected for a NOT
NULL column? I've documented my best guess (not checked closely
against the code, please verify).

-




Test suggestions:

- Coverage of multiple text() node children of an element, where split
up by comment or similar

- Coverage of xpath that matches a node with child element nodes



More to come. Please review my docs changes in the mean time. I'm
spending a lot more time on this than I expected so I might have to
get onto other things for a while too.



-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 464e8f16386c0fbfb8b5c6fdee66402c95c6ae9f Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Mon, 12 Sep 2016 12:26:14 +0800
Subject: [PATCH] Proposed docs changes

---
 doc/src/sgml/func.sgml | 103 +++--
 1 file changed, 82 insertions(+), 21 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index ca861d2..3ba492e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10099,7 +10099,7 @@ SELECT xmlroot(xmlparse(document 'abc'),
 

 
-   
+   
 xmltable
 

@@ -10111,7 +10111,9 @@ SELECT xmlroot(xmlparse(document 'abc'),
 
 
 
-  The xmltable produces table based on passed XML value.
+  The xmltable produces a table based on the
+  passed XML value, an xpath filter to extract rows, and an
+  optional set of column definitions.
 
 
 
@@ -10140,11 +10142,11 @@ SELECT  xmltable.*
 
 

- The 

Re: [HACKERS][REVIEW] Tab Completion for CREATE DATABASE ... TEMPLATE ...

2016-09-11 Thread Vitaly Burovoy
On 9/11/16, Tom Lane  wrote:
> Vitaly Burovoy  writes:
>> On 9/11/16, Kevin Grittner  wrote:
>>> I was able to find cases during test which were not handled
>>> correctly with either version, so I tweaked the query a little.
>
>> Hmm. Which one? Attempt to "SET ROLE "?
>> Unfortunately, I after reading your letter I realized that I missed a
>> case (it is not working even with your version):
>
> I wasn't aware that this patch was doing anything nontrivial ...
>
> After looking at it I think it's basically uninformed about how to test
> for ownership.  An explicit join against pg_roles is almost never the
> right way for SQL queries to do that.  Lose the join and write it more
> like this:
>
> +"SELECT pg_catalog.quote_ident(d.datname) "\
> +"  FROM pg_catalog.pg_database d "\
> +" WHERE substring(pg_catalog.quote_ident(d.datname),1,%d)='%s' "\
> +"   AND (d.datistemplate OR pg_catalog.pg_has_role(d.datdba, 'USAGE'))"
>
> See the information_schema views for precedent.
>
>   regards, tom lane

Wow! I had not paid enough attention to the description of "pg_has_role".
Your version works for all my tests. Thank you.

-- 
Best regards,
Vitaly Burovoy




Re: [HACKERS] Write Ahead Logging for Hash Indexes

2016-09-11 Thread Amit Kapila
On Sun, Sep 11, 2016 at 3:01 PM, Mark Kirkwood
 wrote:
> On 11/09/16 19:16, Mark Kirkwood wrote:
>
>>
>>
>> On 11/09/16 17:01, Amit Kapila wrote:
>>>
>>> ...Do you think we can do some read-only
>>> workload benchmarking using this server?  If yes, then probably you
>>> can use concurrent hash index patch [1] and cache the metapage patch
>>> [2] (I think Mithun needs to rebase his patch) to do so.
>>>
>>>
>>>
>>> [1] -
>>> https://www.postgresql.org/message-id/caa4ek1j6b8o4pcepqrxnyblvbftonmjeem+qn0jzx31-obx...@mail.gmail.com
>>> [2] -
>>> https://www.postgresql.org/message-id/cad__ouhj29cebif_flge4t9vj_-cfxbwcxhjo+d_16txbem...@mail.gmail.com
>>>
>>>
>>
>> I can do - are we checking for hangs/assertions or comparing
>> patched vs unpatched performance (for the metapage patch)?
>>
>>

The first step is to find any bugs and then do performance testing.
What I wanted for performance testing was to compare HEAD against all
three patches (all three together) of Hash Index (concurrent hash
index, cache the meta page, WAL for hash index).  For HEAD, we want
two sets of numbers, one with hash indexes and another with btree
indexes.  As Jeff has found a few problems, I think it is better to
fix those first before going on to performance tests.

>
> So, assuming the latter - testing performance with and without the metapage
> patch:
>
> For my 1st runs:
>
> - cpus 16, ran 16G
> - size 100, clients 32
>
> I'm seeing no difference in performance for read only (-S) pgbench workload
> (with everybody using hash indexes). I guess not that surprising as the db
> fits in RAM (1.6G and we have 16G). So I'll retry with a bigger dataset
> (suspect size 2000 is needed).
>

I think you should try with -S -M prepared and with various client
counts (8,16,32,64,100...).


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Re: [COMMITTERS] pgsql: Use LEFT JOINs in some system views in case referenced row doesn

2016-09-11 Thread Michael Paquier
On Mon, Sep 12, 2016 at 11:38 AM, Tom Lane  wrote:
> Michael Paquier  writes:
>> Indeed, and the query field does not have much more meaning for a WAL
>> sender. So I'd propose the attached for 9.6 and HEAD. I have thought
>> about reporting that to pgstat in StartReplication(), but as there is
>> some error handling there I'd think that WalSndLoop() is a better
>> place to call pgstat_report_activity, as per the attached.
>
> As long as it's a fixed string there's no reason to set it repeatedly,
> so this placement looks fine for now.  We can reconsider when/if we
> make it variable and decide what is going to drive it.
>
> On reflection, maybe s/walsender/WAL sender/?  It doesn't look like
> we really use the word "walsender" in user-facing docs.

Indeed, that may be better for clarity.

Apart from the release notes, walsender is mentioned a couple of
times in the protocol docs, as *walsender mode*:
src/sgml/high-availability.sgml:a corresponding walsender process
in the primary.
src/sgml/protocol.sgml:Copy-both mode is initiated when a backend
in walsender mode
src/sgml/protocol.sgml:of true tells the backend to go
into walsender mode, wherein a
src/sgml/protocol.sgml:the simple query protocol can be used in walsender mode.
src/sgml/protocol.sgml:Passing database as the value
instructs walsender to connect to
src/sgml/protocol.sgml:The commands accepted in walsender mode are:
So that looked appropriate to me at first sight. Actually, the text had
better be "WAL sender" and not "walsender" in high-availability.sgml, no?
That refers to the process, and not the backend mode.
-- 
Michael




Re: [HACKERS] Write Ahead Logging for Hash Indexes

2016-09-11 Thread Amit Kapila
On Mon, Sep 12, 2016 at 7:00 AM, Jeff Janes  wrote:
> On Thu, Sep 8, 2016 at 12:09 PM, Jeff Janes  wrote:
>
>>
>> I plan to do testing using my own testing harness after changing it to
>> insert a lot of dummy tuples (ones with negative values in the pseudo-pk
>> column, which are never queried by the core part of the harness) and
>> deleting them at random intervals.  I think that none of pgbench's built in
>> tests are likely to give the bucket splitting and squeezing code very much
>> exercise.
>
>
>
> I've implemented this, by adding lines 197 through 202 to the count.pl
> script.  (I'm reattaching the test case)
>
> Within a few minutes of testing, I start getting Errors like these:
>
> 29236 UPDATE XX000 2016-09-11 17:21:25.893 PDT:ERROR:  buffer 2762 is not
> owned by resource owner Portal
> 29236 UPDATE XX000 2016-09-11 17:21:25.893 PDT:STATEMENT:  update foo set
> count=count+1 where index=$1
>
>
> In one test, I also got an error from my test harness itself indicating
> tuples are transiently missing from the index, starting an hour into a test:
>
> child abnormal exit update did not update 1 row: key 9555 updated 0E0 at
> count.pl line 194.\n  at count.pl line 208.
> child abnormal exit update did not update 1 row: key 8870 updated 0E0 at
> count.pl line 194.\n  at count.pl line 208.
> child abnormal exit update did not update 1 row: key 8453 updated 0E0 at
> count.pl line 194.\n  at count.pl line 208.
>
> Those key values should always find exactly one row to update.
>
> If the tuples were permanently missing from the index, I would keep getting
> errors on the same key values very frequently.  But I don't get that, the
> errors remain infrequent and are on different value each time, so I think
> the tuples are in the index but the scan somehow misses them, either while
> the bucket is being split or while it is being squeezed.
>
> This on a build without enable-asserts.
>
> Any ideas on how best to go about investigating this?
>

I think these symptoms indicate a bug in the concurrent hash index
patch, but it could be that the problem is only revealed with the WAL
patch.  Is it possible to try this with just the concurrent hash index
patch?  In any case, thanks for testing it; I will look into these
issues.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Forbid use of LF and CR characters in database and role names

2016-09-11 Thread Michael Paquier
On Mon, Sep 12, 2016 at 10:01 AM, Noah Misch  wrote:
> I discourage documenting LF/CR restrictions.  For the epsilon of readers who
> would benefit from this knowledge, the error message suffices.  For everyone
> else, it would just dilute the text.  (One could argue against other parts of
> our documentation on this basis, but I'm not calling for such a study.  I'm
> just saying that today's lack of documentation on this topic is optimal.)

Okay.

>> > > I think the way forward here, if any, is to work on removing these
>> > > restrictions, not to keep sprinkling them around.
>> >
>> > If we were talking about pathnames containing spaces, I would agree,
>> > but I've never heard of a legitimate pathname containing CR or LF.  I
>> > can't see us losing much by refusing to allow such pathnames, except
>> > for security holes.
>
> (In other words, +1 to that.)

Yep. To be honest, I see value in directly preventing the use of
newlines and carriage returns in database and role names, to avoid
users being bitten by custom scripts - things written in bash, for
example, that manually scan pg_database, pg_roles, etc. And I have
seen such things over the years. Now it is true that the safeguards in
core prove to be enough as long as only the in-core tools are used, but
that's not necessarily the case with all the things gravitating around
this community.
-- 
Michael




Re: [HACKERS] Re: [COMMITTERS] pgsql: Use LEFT JOINs in some system views in case referenced row doesn

2016-09-11 Thread Tom Lane
Michael Paquier  writes:
> Indeed, and the query field does not have much more meaning for a WAL
> sender. So I'd propose the attached for 9.6 and HEAD. I have thought
> about reporting that to pgstat in StartReplication(), but as there is
> some error handling there I'd think that WalSndLoop() is a better
> place to call pgstat_report_activity, as per the attached.

As long as it's a fixed string there's no reason to set it repeatedly,
so this placement looks fine for now.  We can reconsider when/if we
make it variable and decide what is going to drive it.

On reflection, maybe s/walsender/WAL sender/?  It doesn't look like
we really use the word "walsender" in user-facing docs.

regards, tom lane




Re: [HACKERS] Re: [COMMITTERS] pgsql: Use LEFT JOINs in some system views in case referenced row doesn

2016-09-11 Thread Michael Paquier
On Mon, Sep 12, 2016 at 10:16 AM, Stephen Frost  wrote:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> Michael Paquier  writes:
>> > On Sun, Sep 11, 2016 at 5:21 AM, Tom Lane  wrote:
>> >> The fact that the pg_stat_replication view does show walsender processes
>> >> seems like possibly a reasonable argument for not showing them in
>> >> pg_stat_activity.  But we'd have to do some rejiggering of the view
>> >> definition to make that happen.
>>
>> > We may actually had better show WAL sender processes in
>> > pg_stat_activity. An argument in favor of that is the tracking of
>> > WaitEventSet events (or latches if you want).
>>
>> Also, walsenders count against MaxBackends don't they?  So not showing
>> them could contribute to confusion about why an installation is hitting
>> the connection limit.

Yes they are counted in MaxBackends. So that would help as well in
looking at connection limit issues.

>> If we do keep them in the view, I would definitely vote for having them
>> set their "query" fields to something that shows they're walsenders.
>> It's awfully late to be doing anything complicated there for 9.6,
>> but we could just set the query to "walsender" and plan to improve
>> on that in future releases.
>
> +1

Indeed, and the query field does not have much more meaning for a WAL
sender. So I'd propose the attached for 9.6 and HEAD. I have thought
about reporting that to pgstat in StartReplication(), but as there is
some error handling there I'd think that WalSndLoop() is a better
place to call pgstat_report_activity, as per the attached.

We could have far more fancy verbose information to offer to the user,
but that's another story..
-- 
Michael
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 1ea2a5c..c7743da 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1806,6 +1806,9 @@ WalSndLoop(WalSndSendDataCallback send_data)
 	last_reply_timestamp = GetCurrentTimestamp();
 	waiting_for_ping_response = false;
 
+	/* Report to pgstat that this process is a WAL sender */
+	pgstat_report_activity(STATE_RUNNING, "walsender");
+
 	/*
 	 * Loop until we reach the end of this timeline or the client requests to
 	 * stop streaming.
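For context, a hypothetical query (not part of the patch) showing how WAL
senders could then be spotted from SQL, assuming the fixed "walsender" string
from the hunk above:

  SELECT pid, state, query
  FROM pg_stat_activity
  WHERE query = 'walsender';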



Re: [HACKERS] patch: function xmltable

2016-09-11 Thread Craig Ringer
On 9 September 2016 at 21:44, Pavel Stehule  wrote:
>
>
> 2016-09-09 10:35 GMT+02:00 Pavel Stehule :
>>
>> Hi,
>>
>> I am sending new version of this patch
>>
>> 1. now generic TableExpr is better separated from a real content
>> generation
>> 2. I removed cached typmod - using row type cache everywhere - it is
>> consistent with other few places in Pg where dynamic types are used - the
>> result tupdesc is generated few times more - but it is not on critical path.
>> 3. More comments, few more lines in doc.
>> 4. Reformated by pgindent

Thanks.

I applied this on top of the same base as your prior patch so I could
compare changes.

The new docs look good. Thanks for that, I know it's a pain. It'll
need to cover ORDINAL too, but that's not hard. I'll try to find some
time to help with the docs per the references you sent offlist.

Out of interest, should the syntax allow room for future expansion to
permit reading from file rather than just string literal / column
reference? It'd be ideal to avoid reading big documents wholly into
memory when using INSERT INTO ... SELECT XMLTABLE (...) . I don't
suggest adding that to this patch, just making sure adding it later
would not cause problems.

I see you added a builder context abstraction as discussed, so there's
no longer any direct reference to XMLTABLE specifics from TableExpr
code. Good, thanks for that. It'll make things much less messy when
adding other table expression types as you expressed the desire to do,
and means the TableExpr code now makes more sense as generic
infrastructure.

ExecEvalTableExprProtected and ExecEvalTableExpr are OK with me, or
better than execEvalTableExpr and ExecEvalTableExpr were anyway.
Eventual committer will probably have opinions here.

Mild nitpick: since you can have multiple namespaces, shouldn't
builder->SetNS be builder->AddNS ?

Added comments are helpful, thanks.

On first read-through this is a big improvement and addresses all the
concerns I raised. Documentation is much much better, thanks, I know
that's a pain.

I'll take a closer read-through shortly.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Add support for restrictive RLS policies

2016-09-11 Thread Stephen Frost
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Sep 8, 2016 at 5:21 PM, Tom Lane  wrote:
> > Stephen Frost  writes:
> >> * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> >>> Can't you keep those words as Sconst or something (DefElems?) until the
> >>> execution phase, so that they don't need to be keywords at all?
> >
> >> Seems like we could do that, though I'm not convinced that it really
> >> gains us all that much.  These are only unreserved keywords, of course,
> >> so they don't impact users the way reserved keywords (of any kind) can.
> >> While there may be some places where we use a string to represent a set
> >> of defined options, I don't believe that's typical
> >
> > -1 for having to write them as string literals; but I think what Alvaro
> > really means is to arrange for the words to just be identifiers in the
> > grammar, which you strcmp against at execution.  See for example
> > reloption_list.  (Whether you use DefElem as the internal representation
> > is a minor detail, though it might help for making the parsetree
> > copyObject-friendly.)
> >
> > vacuum_option_elem shows another way to avoid making a word into a
> > keyword, although to me that one is more of an antipattern; it'd be better
> > to leave the strcmp to execution, since there's so much other code that
> > does things that way.
> 
> There are other cases like that, too, e.g. AlterOptRoleElem; I don't
> think it's a bad pattern.  Regardless of the specifics, I do think
> that it would be better not to bloat the keyword table with things
> that don't really need to be keywords.

The AlterOptRoleElem case is, I believe, what Tom was just suggesting as
an antipattern, since the strcmp() is being done at parse time instead
of at execution time.

If we are concerned about having too many unreserved keywords, then I
agree that AlterOptRoleElem is a good candidate to look at for reducing
the number we have, as it appears to contain 3 keywords which are not
used anywhere else (and 1 other which is only used in one other place).

I do think that using IDENT for the various role attributes does make
sense, to be clear, as there are quite a few of them, they change
depending on major version, and those keywords are very unlikely to ever
have utilization elsewhere.

For this case, there are just 2 keywords which seem like they may be used
again (perhaps for ALTER or DROP POLICY, or default policies if we grow
those in the future), and we're unlikely to grow more in the future for
that particular case (as we only have two binary boolean operators and
that seems unlikely to change); should that happen, we could
certainly revisit this.

Thanks!

Stephen




Re: [HACKERS] Add support for restrictive RLS policies

2016-09-11 Thread Robert Haas
On Thu, Sep 8, 2016 at 5:21 PM, Tom Lane  wrote:
> Stephen Frost  writes:
>> * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
>>> Can't you keep those words as Sconst or something (DefElems?) until the
>>> execution phase, so that they don't need to be keywords at all?
>
>> Seems like we could do that, though I'm not convinced that it really
>> gains us all that much.  These are only unreserved keywords, of course,
>> so they don't impact users the way reserved keywords (of any kind) can.
>> While there may be some places where we use a string to represent a set
>> of defined options, I don't believe that's typical
>
> -1 for having to write them as string literals; but I think what Alvaro
> really means is to arrange for the words to just be identifiers in the
> grammar, which you strcmp against at execution.  See for example
> reloption_list.  (Whether you use DefElem as the internal representation
> is a minor detail, though it might help for making the parsetree
> copyObject-friendly.)
>
> vacuum_option_elem shows another way to avoid making a word into a
> keyword, although to me that one is more of an antipattern; it'd be better
> to leave the strcmp to execution, since there's so much other code that
> does things that way.

There are other cases like that, too, e.g. AlterOptRoleElem; I don't
think it's a bad pattern.  Regardless of the specifics, I do think
that it would be better not to bloat the keyword table with things
that don't really need to be keywords.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Write Ahead Logging for Hash Indexes

2016-09-11 Thread Jeff Janes
On Thu, Sep 8, 2016 at 12:09 PM, Jeff Janes  wrote:


> I plan to do testing using my own testing harness after changing it to
> insert a lot of dummy tuples (ones with negative values in the pseudo-pk
> column, which are never queried by the core part of the harness) and
> deleting them at random intervals.  I think that none of pgbench's built in
> tests are likely to give the bucket splitting and squeezing code very much
> exercise.
>


I've implemented this, by adding lines 197 through 202 to the count.pl
script.  (I'm reattaching the test case)

Within a few minutes of testing, I start getting Errors like these:

29236 UPDATE XX000 2016-09-11 17:21:25.893 PDT:ERROR:  buffer 2762 is not
owned by resource owner Portal
29236 UPDATE XX000 2016-09-11 17:21:25.893 PDT:STATEMENT:  update foo set
count=count+1 where index=$1


In one test, I also got an error from my test harness itself indicating
tuples are transiently missing from the index, starting an hour into a test:

child abnormal exit update did not update 1 row: key 9555 updated 0E0 at
count.pl line 194.\n  at count.pl line 208.
child abnormal exit update did not update 1 row: key 8870 updated 0E0 at
count.pl line 194.\n  at count.pl line 208.
child abnormal exit update did not update 1 row: key 8453 updated 0E0 at
count.pl line 194.\n  at count.pl line 208.

Those key values should always find exactly one row to update.

If the tuples were permanently missing from the index, I would keep getting
errors on the same key values very frequently.  But I don't get that, the
errors remain infrequent and are on different value each time, so I think
the tuples are in the index but the scan somehow misses them, either while
the bucket is being split or while it is being squeezed.

This on a build without enable-asserts.

Any ideas on how best to go about investigating this?

Cheers,

Jeff


count.pl
Description: Binary data


crash_REL10.patch
Description: Binary data


do.sh
Description: Bourne shell script



Re: [HACKERS] Re: [COMMITTERS] pgsql: Use LEFT JOINs in some system views in case referenced row doesn

2016-09-11 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Michael Paquier  writes:
> > On Sun, Sep 11, 2016 at 5:21 AM, Tom Lane  wrote:
> >> The fact that the pg_stat_replication view does show walsender processes
> >> seems like possibly a reasonable argument for not showing them in
> >> pg_stat_activity.  But we'd have to do some rejiggering of the view
> >> definition to make that happen.
> 
> > We may actually had better show WAL sender processes in
> > pg_stat_activity. An argument in favor of that is the tracking of
> > WaitEventSet events (or latches if you want).
> 
> Also, walsenders count against MaxBackends don't they?  So not showing
> them could contribute to confusion about why an installation is hitting
> the connection limit.
> 
> If we do keep them in the view, I would definitely vote for having them
> set their "query" fields to something that shows they're walsenders.
> It's awfully late to be doing anything complicated there for 9.6,
> but we could just set the query to "walsender" and plan to improve
> on that in future releases.

+1

Thanks!

Stephen




Re: [HACKERS] Re: [COMMITTERS] pgsql: Use LEFT JOINs in some system views in case referenced row doesn

2016-09-11 Thread Tom Lane
Michael Paquier  writes:
> On Sun, Sep 11, 2016 at 5:21 AM, Tom Lane  wrote:
>> The fact that the pg_stat_replication view does show walsender processes
>> seems like possibly a reasonable argument for not showing them in
>> pg_stat_activity.  But we'd have to do some rejiggering of the view
>> definition to make that happen.

> We may actually had better show WAL sender processes in
> pg_stat_activity. An argument in favor of that is the tracking of
> WaitEventSet events (or latches if you want).

Also, walsenders count against MaxBackends don't they?  So not showing
them could contribute to confusion about why an installation is hitting
the connection limit.

If we do keep them in the view, I would definitely vote for having them
set their "query" fields to something that shows they're walsenders.
It's awfully late to be doing anything complicated there for 9.6,
but we could just set the query to "walsender" and plan to improve
on that in future releases.

regards, tom lane




Re: [HACKERS][REVIEW] Tab Completion for CREATE DATABASE ... TEMPLATE ...

2016-09-11 Thread Tom Lane
Vitaly Burovoy  writes:
> On 9/11/16, Kevin Grittner  wrote:
>> I was able to find cases during test which were not handled
>> correctly with either version, so I tweaked the query a little.

> Hmm. Which one? Attempt to "SET ROLE "?
> Unfortunately, I after reading your letter I realized that I missed a
> case (it is not working even with your version):

I wasn't aware that this patch was doing anything nontrivial ...

After looking at it I think it's basically uninformed about how to test
for ownership.  An explicit join against pg_roles is almost never the
right way for SQL queries to do that.  Lose the join and write it more
like this:

+"SELECT pg_catalog.quote_ident(d.datname) "\
+"  FROM pg_catalog.pg_database d "\
+" WHERE substring(pg_catalog.quote_ident(d.datname),1,%d)='%s' "\
+"   AND (d.datistemplate OR pg_catalog.pg_has_role(d.datdba, 'USAGE'))"

See the information_schema views for precedent.
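As a rough illustration of the suggested approach (hypothetical query, not
part of the patch): pg_has_role() checks direct and inherited membership, so
no explicit join against pg_roles is needed.

  SELECT d.datname,
         d.datistemplate OR pg_catalog.pg_has_role(d.datdba, 'USAGE') AS usable_as_template
  FROM pg_catalog.pg_database d;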

regards, tom lane




Re: [HACKERS] Forbid use of LF and CR characters in database and role names

2016-09-11 Thread Noah Misch
On Thu, Sep 08, 2016 at 12:12:49PM -0400, Peter Eisentraut wrote:
> On 9/6/16 1:42 PM, Robert Haas wrote:
> > On Tue, Sep 6, 2016 at 9:43 PM, Peter Eisentraut 
> >  wrote:
> > > Everything that is using appendShellString() is now going to reject LF
> > > and CR characters, but there is no systematic way by which this is
> > > managed, enforced, or documented.

I agree, and I think that's roughly what it deserves.  Longstanding use of
MAXPGPATH is worse; we truncate silently and proceed to whatever malfunction
that entails.  We do not see complaints about it.  To make MAXPGPATH or LF/CR
better-managed would nominally improve PostgreSQL, but the value is low.

I discourage documenting LF/CR restrictions.  For the epsilon of readers who
would benefit from this knowledge, the error message suffices.  For everyone
else, it would just dilute the text.  (One could argue against other parts of
our documentation on this basis, but I'm not calling for such a study.  I'm
just saying that today's lack of documentation on this topic is optimal.)

> > > I think the way forward here, if any, is to work on removing these
> > > restrictions, not to keep sprinkling them around.
> > 
> > If we were talking about pathnames containing spaces, I would agree,
> > but I've never heard of a legitimate pathname containing CR or LF.  I
> > can't see us losing much by refusing to allow such pathnames, except
> > for security holes.

(In other words, +1 to that.)

> The flip side of that is that if we're doing a half-way job of only
> prohibiting these characters in 67% of cases, then a new generation of
> tools will be written on top of that with the assumption that these
> characters cannot appear.

There's some risk there, but it doesn't feel too likely to me.  If the current
tools generation had contemplated those characters, we'd have learned of
CVE-2016-5424 earlier.  Even so, ...

> The current setup is more robust:  We are prohibiting these characters
> in specific locations where we know we can't handle them.  But we don't
> give any guarantees about anything else.

... I'd be fine with that conclusion.




Re: [HACKERS] Re: [COMMITTERS] pgsql: Use LEFT JOINs in some system views in case referenced row doesn

2016-09-11 Thread Michael Paquier
On Sun, Sep 11, 2016 at 5:21 AM, Tom Lane  wrote:
> The fact that the pg_stat_replication view does show walsender processes
> seems like possibly a reasonable argument for not showing them in
> pg_stat_activity.  But we'd have to do some rejiggering of the view
> definition to make that happen.

We may actually had better show WAL sender processes in
pg_stat_activity. An argument in favor of that is the tracking of
WaitEventSet events (or latches if you want).
-- 
Michael




[HACKERS] CommitFest 2016-09 status summary

2016-09-11 Thread Fabrízio de Royes Mello
Hi all,

The current status summary is:

Needs review: 81
Waiting on author: 35
Ready for Committer: 12
Committed: 78
Moved to next CF: 1
Rejected: 6
Returned with feedback: 6
TOTAL: 219

The current progress is ~39%. Things are moving fast, but 27 patches still
have no signed-up reviewer. So if you have time, please consider reviewing at
least one patch. We need your help!

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


Re: [HACKERS] Tuplesort merge pre-reading

2016-09-11 Thread Gavin Flower

On 12/09/16 12:16, Gavin Flower wrote:
[...]
 two blocks would be logically adjacent (which means they are likely 
to be physically close together, but not guaranteed!).



[...]

Actual disk layouts are quite complicated; the above is an 
oversimplification, but the message is still valid.


There are various tricks of disc layout (and low-level handling) that can 
be used to minimise the time taken to read 2 blocks that are logically 
adjacent.  I had to know this stuff for the discs that mainframe computers 
used in the 1980s - modern disk systems are way more complicated, but 
the conclusions are still valid.


I am extremely glad that I no longer have to concern myself with 
understanding the precise low-level stuff on discs these days - there is no 
longer a one-to-one correspondence between what the O/S thinks is a disk 
block and how the data is physically recorded on the disc.



Cheers,
Gavin






Re: [HACKERS] Tuplesort merge pre-reading

2016-09-11 Thread Gavin Flower

On 12/09/16 10:13, Peter Geoghegan wrote:

On Sun, Sep 11, 2016 at 8:47 AM, Heikki Linnakangas  wrote:

[...]

I don't know what the difference is between accessing 10 pages
randomly, and accessing a random set of 10 single pages sequentially,
in close succession. As Tom would say, that's above my pay grade. I
suppose it comes down to how close "close" actually is (but in any
case, it's all very fudged).


If you select ten pages at random and sort them, then consecutive reads 
of the sorted list are more likely to access pages in the same block or 
closely adjacent (is my understanding).


eg

blocks:  [  0  ] [  1  ] [  2  ] [  3  ] [  4  ]
pages:     0  1    2  3    4  5    6  7    8  9

if the ten 'random pages' were selected in the random order:
6 1 7 8 4 2 9 3 0
Consecutive reads would always read new blocks, but the sorted list 
would have blocks read sequentially.


In practice, it would rarely be this simple.  However, if any of the 
random pages were in the same block, then that block would only need to 
be fetched once - similarly, if 2 of the random pages were in 
consecutive blocks, then the two blocks would be logically adjacent 
(which means they are likely to be physically close together, but not 
guaranteed!).


[...]


Cheers,
Gavin




Re: [HACKERS] Tuplesort merge pre-reading

2016-09-11 Thread Peter Geoghegan
On Sun, Sep 11, 2016 at 3:13 PM, Peter Geoghegan  wrote:
>> +   for (tapenum = 0; tapenum < maxTapes; tapenum++)
>> +   {
>> +   LogicalTapeAssignReadBufferSize(state->tapeset, tapenum,
>> +   (per_tape + (tapenum < cutoff ? 1 : 
>> 0)) * BLCKSZ);
>> +   }

Spotted another issue with this code just now. Shouldn't it be based
on state->tapeRange? You don't want the destTape to get memory, since
you don't use batch memory for tapes that are written to (and final
on-the-fly merges don't use their destTape at all).

(Looks again...)

Wait, you're using a local variable maxTapes here, which potentially
differs from state->maxTapes:

> +   /*
> +* If we had fewer runs than tapes, refund buffers for tapes that were 
> never
> +* allocated.
> +*/
> +   maxTapes = state->maxTapes;
> +   if (state->currentRun < maxTapes)
> +   {
> +   FREEMEM(state, (maxTapes - state->currentRun) * TAPE_BUFFER_OVERHEAD);
> +   maxTapes = state->currentRun;
> +   }

I find this confusing, and think it's probably buggy. I don't think
you should have a local variable called maxTapes that you modify at
all, since state->maxTapes is supposed to not change once established.
You can't use state->currentRun like that, either, I think, because
it's the high watermark number of runs (quicksorted runs), not runs
after any particular merge phase, where we end up with fewer runs as
they're merged (we must also consider dummy runs to get this) -- we
want something like activeTapes. cf. the code you removed for the
beginmerge() finalMergeBatch case. Of course, activeTapes will vary if
there are multiple merge passes, which suggests all this code really
has no business being in mergeruns() (it should instead be in
beginmerge(), or code that beginmerge() reliably calls).

Immediately afterwards, you do this:

> +   /* Initialize the merge tuple buffer arena.  */
> +   state->batchMemoryBegin = palloc((maxTapes + 1) * MERGETUPLEBUFFER_SIZE);
> +   state->batchMemoryEnd = state->batchMemoryBegin + (maxTapes + 1) * 
> MERGETUPLEBUFFER_SIZE;
> +   state->freeBufferHead = (MergeTupleBuffer *) state->batchMemoryBegin;
> +   USEMEM(state, (maxTapes + 1) * MERGETUPLEBUFFER_SIZE);

The fact that you size the buffer based on "maxTapes + 1" also
suggests a problem. I think that the code looks like this because it
must instruct logtape.c that the destTape tape requires some buffer
(iff there is to be a non-final merge). Is that right? I hope that you
don't give the typically unused destTape tape a full share of batch
memory all the time (the same applies to any other
inactive-at-final-merge tapes).

-- 
Peter Geoghegan




Re: [HACKERS] Merge Join with an Index Optimization

2016-09-11 Thread Thomas Munro
On Sun, Sep 11, 2016 at 11:07 PM, Michael Malis  wrote:
> On Sun, Sep 11, 2016 at 9:20 AM Tom Lane  wrote:
>>
>> Michael Malis  writes:
>> >> As I understand it, a merge join will currently read all tuples from
>> >> both
>> >> subqueries (besides early termination). I believe it should be possible
>> >> to
>> >> take advantages of the indexes on one or both of the tables being read
>> >> from
>> >> to skip a large number of tuples that would currently be read.
>> >>
>> > IIUC, what you're proposing is to replace "read past some tuples in the
>> > index" with "re-descend the index btree".  Yes, that would win if it
>
>
> Roughly yes, although is it necessary to rescan the btree from the top? Is
> it not possible to "resume" the scan from the previous location?
>
>>
>> > You might want to troll the archives for past discussions of index skip
>> > scan, which is more or less the same idea (could possibly be exactly the
>> > same idea, depending on how it's implemented).
>
>
> After searching a bit, all I was able to find was this. It looks like
> someone had a rough patch of applying a similar idea to DISTINCT. I can't
> tell what happened to it, but in the patch they mention that it should be
> possible to do a "local traversal ie from the current page".

That was me.  Yeah, reserving the option for traversing other than
from the root was my reason for wanting to introduce a skip operation
to the IM interface as described in the later email in that thread[2],
rather than doing it via rescanning (though maybe it's not strictly
necessary, I don't know).  There are a whole lot of interesting
execution tricks that could be enabled by btree skipping (see Oracle,
DB2, MySQL, SQLite).  DISTINCT was the simplest thing I could think of
where literally every other RDBMS beats us because we don't have it.
But that was nowhere near a useful patch, just an experiment.  I hope
I can get back to looking at it for 10...

[2] 
https://www.postgresql.org/message-id/CADLWmXWALK8NPZqdnRQiPnrzAnic7NxYKynrkzO_vxYr8enWww%40mail.gmail.com

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS][REVIEW] Tab Completion for CREATE DATABASE ... TEMPLATE ...

2016-09-11 Thread Vitaly Burovoy
On 9/11/16, Kevin Grittner  wrote:
> On Sat, Sep 10, 2016 at 12:26 AM, Vitaly Burovoy
>> Mark it as "Ready for committer".
>>
>> P.S.: While I was reviewing I simplified SQL query: improved version
>> only 2 seqscans instead of 3 seqscans with an inner loop in an
>> original one.
>> Please find a file "tab-complete-create-database-improved.patch"
>> attached.
>
> I was able to find cases during test which were not handled
> correctly with either version, so I tweaked the query a little.

Hmm. Which one? An attempt to "SET ROLE "?
Unfortunately, after reading your letter I realized that I missed a
case (it is not working even with your version):

postgres=# CREATE GROUP g1;
CREATE ROLE
postgres=# CREATE GROUP g2;
CREATE ROLE
postgres=# GRANT g2 TO g1;
GRANT ROLE
postgres=# CREATE USER u1 CREATEDB;
CREATE ROLE
postgres=# GRANT g1 TO u1;
GRANT ROLE
postgres=# CREATE DATABASE db_tpl;
CREATE DATABASE
postgres=# ALTER DATABASE db_tpl OWNER TO g2;
ALTER DATABASE
postgres=# SET ROLE g1;
SET
postgres=> CREATE DATABASE db1 TEMPLATE db   -- shows nothing

postgres=> CREATE DATABASE db1 TEMPLATE db^C -- shows nothing
postgres=> RESET ROLE;
RESET
postgres=# SET ROLE u1;
SET
postgres=> CREATE DATABASE db1 TEMPLATE db   -- shows nothing

postgres=> CREATE DATABASE db1 TEMPLATE db^C -- shows nothing
postgres=> CREATE DATABASE db1 TEMPLATE db_tpl;
CREATE DATABASE


It seems that if a user has the CREATEDB privilege and is a member of a
group (role) which owns a database, he can use the database as a
template. But to support that, recursive queries would be needed. Is it
useless, or should it be fixed?

> Also, for future reference, please try to use white-space that
> matches surrounding code -- it make scanning through code less
> "jarring".

I tried to. Should "FROM" be on the same row as the sub-"SELECT"? Or
should there be something else (I'm asking only about white-space
formatting)?
The surrounding code looks similar enough to me... =(

> Thanks all for the patch and the reviews!

Thank you!

> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

-- 
Best regards,
Vitaly Burovoy




Re: [HACKERS] Tuplesort merge pre-reading

2016-09-11 Thread Peter Geoghegan
On Sun, Sep 11, 2016 at 3:13 PM, Peter Geoghegan  wrote:
> * Doesn't this code need to call MemoryContextAllocHuge() rather than 
> palloc()?:
>
>> @@ -709,18 +765,19 @@ LogicalTapeRewind(LogicalTapeSet *lts, int tapenum, 
>> bool forWrite)
>> Assert(lt->frozen);
>> datablocknum = ltsRewindFrozenIndirectBlock(lts, lt->indirect);
>> }
>> +
>> +   /* Allocate a read buffer */
>> +   if (lt->buffer)
>> +   pfree(lt->buffer);
>> +   lt->buffer = palloc(lt->read_buffer_size);
>> +   lt->buffer_size = lt->read_buffer_size;

Of course, when you do that you're going to have to make the new
"buffer_size" and "read_buffer_size" fields of type "Size" (or,
possibly, "int64", to match tuplesort.c's own buffer sizing variables
ever since Noah added MaxAllocSizeHuge). Ditto for the existing "pos"
and "nbytes" fields next to "buffer_size" within the struct
LogicalTape, I think. ISTM that logtape.c blocknums can remain of type
"long", though, since that reflects an existing hardly-relevant
limitation that you're not making any worse.

More generally, I think you also need to explain in comments why there
is a "read_buffer_size" field in addition to the "buffer_size" field.
-- 
Peter Geoghegan




[HACKERS] Typo in filename identification

2016-09-11 Thread Daniel Gustafsson
The IDENTIFICATION filename in src/backend/storage/ipc/dsm_impl.c is
incorrectly labelling the file dsm.c.  Patch fixing the typo attached.

cheers ./daniel



typo-dsm_impl_identification.diff
Description: Binary data



Re: [HACKERS] Tuplesort merge pre-reading

2016-09-11 Thread Peter Geoghegan
On Sun, Sep 11, 2016 at 3:13 PM, Peter Geoghegan  wrote:
> * Please make this use the ".., + 1023" natural rounding trick that is
> used in the similar traces that are removed:
>
>> +#ifdef TRACE_SORT
>> +   if (trace_sort)
>> +   elog(LOG, "using %d kB of memory for read buffers in %d tapes, %d kB 
>> per tape",
>> +(int) (state->availMem / 1024), maxTapes, (int) (per_tape * 
>> BLCKSZ) / 1024);
>> +#endif

Also, please remove the int cast, and use INT64_FORMAT. Again, this
should match existing trace_sort traces concerning batch memory.


-- 
Peter Geoghegan




Re: [HACKERS] Useless dependency assumption libxml2 -> libxslt in MSVC scripts

2016-09-11 Thread Michael Paquier
On Mon, Sep 12, 2016 at 1:47 AM, Tom Lane  wrote:
> Looks reasonable to me, we'll soon see what the buildfarm thinks.

Thanks for the commit. I am seeing green statuses on the buildfarm.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tuplesort merge pre-reading

2016-09-11 Thread Peter Geoghegan
On Sun, Sep 11, 2016 at 8:47 AM, Heikki Linnakangas  wrote:
> Here's a new version of these patches, rebased over current master. I
> squashed the two patches into one, there's not much point to keep them
> separate.

I think I have my head fully around this now. For some reason, I
initially thought that this patch was a great deal more radical than
it actually is. (Like Greg, I somehow initially thought that you were
rejecting the idea of batch memory in general, and somehow (over)
leveraging the filesystem cache. I think I misunderstood your remarks
when we talked on IM about it early on.)

I don't know what the difference is between accessing 10 pages
randomly, and accessing a random set of 10 single pages sequentially,
in close succession. As Tom would say, that's above my pay grade. I
suppose it comes down to how close "close" actually is (but in any
case, it's all very fudged).

I mention this because I think that cost_sort() should be updated to
consider sequential I/O the norm, alongside this patch of yours (your
patch strengthens the argument [1] for that general idea). The reason
that this new approach produces more sequential I/O, apart from the
simple fact that you effectively have much more memory available and
so fewer rounds of preloading, is that the indirection block stuff can
make I/O less sequential in order to support eager reclamation of
space. For example, maybe there is interleaving of blocks as logtape.c
manages to reclaim blocks in the event of multiple merge steps. I care
about that second factor a lot more now than I would have a year ago,
when a final on-the-fly merge generally avoids multiple passes (and
associated logtape.c block fragmentation), because parallel CREATE
INDEX is usually affected by that (workers will often want to do their
own merge ahead of the leader's final merge), and because I want to
cap the number of tapes used, which will make multiple passes a bit
more common in practice.

I was always suspicious of the fact that memtuples is so large during
merging, and had a vague plan to fix that (although I was the one
responsible for growing memtuples even more for the merge in 9.6, that
was just because under the status quo of having many memtuples during
the merge, the size of memtuples should at least be in balance with
remaining memory for caller tuples -- it wasn't an endorsement of the
status quo). However, it never occurred to me to do that by pushing
batch memory into the head of logtape.c, which now seems like an
excellent idea.

To summarize my understanding of this patch: If it wasn't for my work
on parallel CREATE INDEX, I would consider this patch to give us only
a moderate improvement to user-visible performance, since it doesn't
really help memory-rich cases very much (cases that are not likely to
have much random I/O anyway). In that universe, I'd be more
appreciative of the patch as a simplifying refactoring exercise. While
it's nice to get a boost for more memory-constrained cases, it's not a
huge win.
in -- I *am* working on parallel CREATE INDEX, of course. And so, I
really am very excited about this patch, because it really helps with
the particular challenges I have there, even though it's fair to
assume that we have a reasonable amount of memory available when
parallelism is used. If workers are going to do their own merges, as
they often will, then multiple merge pass cases become far more
important, and any additional I/O is a real concern, *especially*
additional random I/O (say from logtape.c fragmentation). The patch
directly addresses that, which is great. Your patch, alongside my
just-committed patch concerning how we maintain the heap invariant,
together attack the merge bottleneck from two different directions:
they address I/O costs, and CPU costs, respectively.

Other things I noticed:

* You should probably point out that typically, access to batch memory
will still be sequential, despite your block-based scheme. The
preloading will now more or less make that the normal case. Any
fragmentation will now be essentially in memory, not on disk, which is
a big win.

* I think that logtape.c header comments are needed for this. Maybe
that's where you should point out that memory access is largely
sequential. But it's surely true that logtape.c needs to take
responsibility for being the place where the vast majority of memory
is allocated during merging.

* i think you should move "bool   *mergeactive; /* active input run
source? */" within Tuplesortstate to be next to the other batch memory
stuff. No point in having separate merge and batch "sections" there
anymore.

* You didn't carry over my 0002-* batch memory patch modifications to
comments, even though you should have in a few cases. There remain
some references in comments to batch memory as a thing exclusively
usable by final on-the-fly merges. That's not true anymore -- it's
usable by final merges, too. For 

Re: [HACKERS] Merge Join with an Index Optimization

2016-09-11 Thread Michael Malis
On Sun, Sep 11, 2016 at 9:20 AM Tom Lane  wrote:

> Michael Malis  writes:
> >> As I understand it, a merge join will currently read all tuples from
> both
> >> subqueries (besides early termination). I believe it should be possible
> to
> >> take advantages of the indexes on one or both of the tables being read
> from
> >> to skip a large number of tuples that would currently be read.
> >>
> > IIUC, what you're proposing is to replace "read past some tuples in the
> > index" with "re-descend the index btree".  Yes, that would win if it
>

Roughly yes, although is it necessary to rescan the btree from the top? Is
it not possible to "resume" the scan from the previous location?


> > You might want to troll the archives for past discussions of index skip
> > scan, which is more or less the same idea (could possibly be exactly the
> > same idea, depending on how it's implemented).
>

After searching a bit, all I was able to find was this
.
It looks like someone had a rough patch applying a similar idea to
DISTINCT. I can't tell what happened to it, but in the patch they mention
that it should be possible to do a "local traversal ie from the current
page".

A different but similar optimization would be to first check the join
condition against the data in the index, and then fetch the full row only if
the data in the index passes the join condition. I imagine that this would
be easier to implement, although less efficient than what I proposed above
because it has to scan the entire index.

Thanks,
Michael


Re: [HACKERS][REVIEW] Tab Completion for CREATE DATABASE ... TEMPLATE ...

2016-09-11 Thread Kevin Grittner
On Sat, Sep 10, 2016 at 12:26 AM, Vitaly Burovoy
 wrote:

> Tested manually in versions 9.2 and 10devel, I hope it can be
> back-patched to all supported versions.

We don't normally back-patch tab completion work unless it is
fixing a crash or removing displayed entries which make no sense.
This is in line with the overall project policy of only
back-patching bug fixes, not enhancements.  At this point I have
only applied it to the master branch, and won't go back farther
without a clear consensus to do so.  It's not that this particular
change is all that scary, but it's a step down a slippery slope
that should not be taken without everyone being on the same page
about why this should be an exception and the next thing should
not.

> Mark it as "Ready for committer".
>
> P.S.: While I was reviewing I simplified SQL query: improved version
> only 2 seqscans instead of 3 seqscans with an inner loop in an
> original one.
> Please find a file "tab-complete-create-database-improved.patch" attached.

I was able to find cases during test which were not handled
correctly with either version, so I tweaked the query a little.
Also, for future reference, please try to use white-space that
matches surrounding code -- it makes scanning through code less
"jarring".

Thanks all for the patch and the reviews!

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL consistency check facility

2016-09-11 Thread Kuntal Ghosh
Hello,

Based on the previous discussions, I've modified the existing patch.

>+   void(*rm_checkConsistency) (XLogReaderState *record);
>All your _checkConsistency functions share the same pattern, in short
>they all use a for loop for each block, call each time
>XLogReadBufferExtended, etc. And this leads to a *lot* of duplication.
>You would get a reduction by a couple of hundreds of lines by having a
>smarter refactoring. And to be honest, if I look at your patch what I
>think is the correct way of doing things is to add to the rmgr not
>this check consistency function, but just a pointer to the masking
>function.
+1. In rmgrlist, I've added a pointer to the masking function for each rmid.
A common function named checkConsistency calls these masking functions
based on their rmid and does the comparison for each block.
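
Roughly, the common function does something like this (a simplified
sketch only -- the buffer-reading details are omitted, and the
rm_maskPage callback name is just indicative of the new rmgr field):

static void
checkConsistency(XLogReaderState *record)
{
	RmgrId		rmid = XLogRecGetRmid(record);
	int			block_id;

	for (block_id = 0; block_id <= record->max_block_id; block_id++)
	{
		char		replay_page[BLCKSZ];
		char		primary_page[BLCKSZ];

		if (!XLogRecHasBlockImage(record, block_id))
			continue;

		/* ... fill replay_page from the just-replayed buffer, and
		 * primary_page from the backup image stored in the record ... */

		/* mask out the bytes that may legitimately differ, then compare */
		RmgrTable[rmid].rm_maskPage(replay_page);
		RmgrTable[rmid].rm_maskPage(primary_page);

		if (memcmp(replay_page, primary_page, BLCKSZ) != 0)
			elog(WARNING, "inconsistency found in rmgr %s, block %d",
				 RmgrTable[rmid].rm_name, block_id);
	}
}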

>> - If WAL consistency check is enabled for a rmgrID, we always include
>> the backup image in the WAL record.
>
>What happens if wal_consistency has different settings on a standby
>and its master? If for example it is set to 'all' on the standby, and
>'none' on the master, or vice-versa, how do things react? An update of
>this parameter should be WAL-logged, no?
If wal_consistency is enabled for an rmid, the standby will always check
whether a backup image exists, i.e. whether BKPBLOCK_HAS_IMAGE is set.
(I guess Amit and Robert also suggested the same in the thread.)
Basically, BKPBLOCK_HAS_IMAGE is set if a block contains an image, and
BKPIMAGE_IS_REQUIRED_FOR_REDO (I've added this one) is set if that backup
image is required during redo. When we decode a WAL record, the has_image
flag of DecodedBkpBlock is set to BKPIMAGE_IS_REQUIRED_FOR_REDO.

>+   if (pg_strcasecmp(tok, "Heap2") == 0)
>+   {
>+   newwalconsistency[RM_HEAP2_ID] = true;
>+   }
>Thinking more about it, I guess that we had better change the
>definition list of rmgrs in rmgr.h and get something closer to
>RmgrDescData that pg_xlogdump has to avoid all this stanza by
>completing it with the name of the rmgr. The only special cases that
>this code path would need to take care of would be then 'none' and
>'all'. You could do this refactoring on top of the main patch to
>simplify it as it is rather big (1.7k lines).
I've modified it exactly like pg_xlogdump does. Additionally, it checks
whether a masking function is defined for the rmid. Hence, in the future,
if we want to include any other rmid in the WAL consistency check, we just need
to define its masking function.
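
So an rmgrlist.h entry just gains the masking callback as its last
column, along these lines (hypothetical example -- heap2_mask is only an
illustrative name):

/* symbol name, name, redo, desc, identify, startup, cleanup, maskPage */
PG_RMGR(RM_HEAP2_ID, "Heap2", heap2_redo, heap2_desc, heap2_identify, NULL, NULL, heap2_mask)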

>> - In recovery tests (src/test/recovery/t), I've added wal_consistency
>> parameter in the existing scripts. This feature doesn't change the
>> expected output. If there is any inconsistency, it can be verified in
>> corresponding log file.
>
>I am afraid that just generating a WARNING message is going to be
>useless for the buildfarm. If we want to detect errors, we could for
>example have an additional GUC to trigger an ERROR or a FATAL, taking
>down the cluster, and allowing things to show in red on a platform.
For now, I've kept this as a WARNING message to detect all inconsistencies
at once. Once the patch is finalized, I'll change it to an ERROR message.

Thoughts?


-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com
diff --git a/src/backend/access/transam/rmgr.c b/src/backend/access/transam/rmgr.c
index 9bb1362..3ca64d1 100644
--- a/src/backend/access/transam/rmgr.c
+++ b/src/backend/access/transam/rmgr.c
@@ -26,12 +26,13 @@
 #include "commands/tablespace.h"
 #include "replication/message.h"
 #include "replication/origin.h"
+#include "storage/bufmask.h"
 #include "storage/standby.h"
 #include "utils/relmapper.h"
 
 /* must be kept in sync with RmgrData definition in xlog_internal.h */
-#define PG_RMGR(symname,name,redo,desc,identify,startup,cleanup) \
-	{ name, redo, desc, identify, startup, cleanup },
+#define PG_RMGR(symname,name,redo,desc,identify,startup,cleanup,maskPage) \
+	{ name, redo, desc, identify, startup, cleanup, maskPage },
 
 const RmgrData RmgrTable[RM_MAX_ID + 1] = {
 #include "access/rmgrlist.h"
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 2189c22..77e79f5 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -25,6 +25,7 @@
 #include "access/commit_ts.h"
 #include "access/multixact.h"
 #include "access/rewriteheap.h"
+#include "access/rmgr.h"
 #include "access/subtrans.h"
 #include "access/timeline.h"
 #include "access/transam.h"
@@ -95,6 +96,8 @@ bool		EnableHotStandby = false;
 bool		fullPageWrites = true;
 bool		wal_log_hints = false;
 bool		wal_compression = false;
+char		*wal_consistency_string = NULL;
+bool		*wal_consistency = NULL;
 bool		log_checkpoints = false;
 int			sync_method = DEFAULT_SYNC_METHOD;
 int			wal_level = WAL_LEVEL_MINIMAL;
@@ -870,6 +873,7 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr 

Re: [HACKERS][REVIEW] Tab Completion for CREATE DATABASE ... TEMPLATE ...

2016-09-11 Thread Tom Lane
Kevin Grittner  writes:
> test=# create role fred with createdb;
> CREATE ROLE
> test=# create user bob;
> CREATE ROLE
> test=# grant fred to bob;
> GRANT ROLE
> test=# alter database postgres owner to fred;
> ALTER DATABASE
> test=# set role fred;
> SET
> test=> create database db1 template postgres;
> CREATE DATABASE
> test=> reset role;
> RESET
> test=# set role bob;
> SET
> test=> create database db2 template postgres;
> ERROR:  permission denied to create database

> Opinions on whether this is a bug or correct behavior?

It's operating as designed, anyway.  Role properties such as CREATEDB
are not grantable privileges and thus can't be inherited via GRANT.
There's been some muttering about changing that; but most people don't
seem to think that letting superuserness in particular be inherited
would be a good thing, so it hasn't gone anywhere.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Install extensions using update scripts (was Re: [HACKERS] Remove superuser() checks from pgstattuple)

2016-09-11 Thread Tom Lane
Pushed with adjustments for the review points.  Hopefully this will make
Stephen's life easier, along with others submitting contrib-module
updates.  We should urge anyone who submits an old-style extension update
patch to consider whether they really want to bother with a new base
script.

At some point it might be nice to have an articulated project policy
about when a new base script is or isn't appropriate, but I doubt we
have enough experience to lay one down now.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2016-09-11 Thread Peter Geoghegan
On Sun, Sep 11, 2016 at 6:28 AM, Heikki Linnakangas  wrote:
> Pushed this "displace root" patch, with some changes:

Attached is rebased version of the entire patch series, which should
be applied on top of what you pushed to the master branch today.

This features a new scheme for managing workMem --
maintenance_work_mem is now treated as a high watermark/budget for the
entire CREATE INDEX operation, regardless of the number of workers.
This seems to work much better, so Robert was right to suggest it.

There were also improvements to the cost model, to weigh available
maintenance_work_mem under this new system. And, the cost model was
moved inside planner.c (next to plan_cluster_use_sort()), which is
really where it belongs. The cost model is still WIP, though, and I
didn't address some concerns of my own about how tuplesort.c
coordinates workers. I think that Robert's "condition variables" will
end up superseding that stuff anyway. And, I think that this v2 will
bitrot fairly soon, when Heikki commits what is in effect his version
of my 0002-* patch (that's unchanged, if only because it refactors
some things that the parallel CREATE INDEX patch is reliant on).

So, while there are still a few loose ends with this revision (it
should still certainly be considered WIP), I wanted to get a revision
out quickly because V1 has been left to bitrot for too long now, and
my schedule is very full for the next week, ahead of my leaving to go
on vacation (which is long overdue). Hopefully, I'll be able to get
out a third revision next Saturday, on top of the
by-then-presumably-committed new tape batch memory patch from Heikki,
just before I leave. I'd rather leave with a patch available that can
be cleanly applied, to make review as easy as possible, since it
wouldn't be great to have this V2 with bitrot for 10 days or more.

-- 
Peter Geoghegan


0003-Rearrange-header-file-include-directives.patch.gz
Description: GNU Zip compressed data


0005-Add-force_btree_randomaccess-GUC-for-testing.patch.gz
Description: GNU Zip compressed data


0001-Cap-the-number-of-tapes-used-by-external-sorts.patch.gz
Description: GNU Zip compressed data


0002-Use-tuplesort-batch-memory-for-randomAccess-sorts.patch.gz
Description: GNU Zip compressed data


0004-Add-parallel-B-tree-index-build-sorting.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS][REVIEW] Tab Completion for CREATE DATABASE ... TEMPLATE ...

2016-09-11 Thread Kevin Grittner
On Sat, Sep 10, 2016 at 12:26 AM, Vitaly Burovoy
 wrote:

> According to the documentation since 9.2 till devel a database can be
> used as a template if it has a "datistemplate" mark or by superusers
> or by their owners.

Hm.  I wonder whether the following failure of "bob" to use the
specified template is intended or a bug.  It seems to make sense to
sort that out before looking at the related tab completion.

test=# checkpoint;
CHECKPOINT
test=# create role fred with createdb;
CREATE ROLE
test=# create user bob;
CREATE ROLE
test=# grant fred to bob;
GRANT ROLE
test=# alter database postgres owner to fred;
ALTER DATABASE
test=# set role fred;
SET
test=> create database db1 template postgres;
CREATE DATABASE
test=> reset role;
RESET
test=# set role bob;
SET
test=> create database db2 template postgres;
ERROR:  permission denied to create database

Opinions on whether this is a bug or correct behavior?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Useless dependency assumption libxml2 -> libxslt in MSVC scripts

2016-09-11 Thread Tom Lane
Michael Paquier  writes:
> On Sun, Sep 11, 2016 at 1:03 AM, Tom Lane  wrote:
>> It might accidentally fail to fail as-is, but likely it would be better
>> not to be pushing garbage paths into extraincludes and extralibs when
>> some of those options aren't set.

> Right, we need to correct something here. But this block does not need
> any adjustment: it can just be deleted. A contrib module is added via
> AddProject() in Solution.pm, and if the build is done with xml, xslt
> or iconv, their libraries and includes are added in any case, for any
> declared project. So declaring a dependency with xml or xslt is just
> moot work, and actually this would be broken without AddProject
> because it fails to list iconv.lib... At the same time, I think
> that we could fix the inclusions with uuid for uuid-ossp, and just put
> them as well in AddProject with the rest. Please see the updated
> cleanup patch attached.

Looks reasonable to me, we'll soon see what the buildfarm thinks.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-09-11 Thread Christian Convey
Some of your patches look useful, but unrelated to C++: 7, 8, 15, 16(?), 20.

I applied that subset to 9.6 and got a clean "make check".

Would it make sense to add them to the next commitfest, regardless of
the C++ effort?

On Wed, Aug 31, 2016 at 9:41 AM, Peter Eisentraut
 wrote:
> [trimmed cc list because of big attachments]
>
> On 8/16/16 4:22 PM, Jim Nasby wrote:
>> Joy, do you have an idea what a *minimally invasive* patch for C++
>> support would look like? That's certainly the first step here.
>
> I developed a minimally invasive patch for C++ support a few years ago
> shortly after I wrote that blog post.  Since there appears to have been
> some interest here now, I have updated that and split it up into logical
> chunks.
>
> So here you go.
>
> To build this, you need to configure with g++ <= version 5.  (4.x works,
> too.)  g++ version 6 does not work yet because of the issues described
> in patch 0013.
>
> Then you also need to edit src/Makefile.custom and set
>
> COPT = -fpermissive -Wno-sign-compare -Wno-write-strings
>
> The -W options are optional just to reduce some noise.  Cleaning up
> those warnings can be a separate project that might also have some
> benefit under C.
>
> The -fpermissive option is a g++ specific option that reduces some
> errors to warnings.  (So this won't work with clang or other compilers
> at all at this point.)  In particular, C++ does not allow casting from
> or to void pointers without a cast, but -fpermissive allows that.  The
> step from this to "real" C++ would be adding a bunch of casts around
> things like malloc and palloc and other places.  That would be mostly
> busy work, so I have excluded that here.
>
> The patches are numbered approximately in increasing order of dubiosity.
>  So 0001 is probably a straight bug fix, 0002 and 0003 are arguably
> minor bug fixes as well.  The patches through 0012 can probably be
> considered for committing in some form.  After that it gets a bit hackish.
>
> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Merge Join with an Index Optimization

2016-09-11 Thread Tom Lane
Michael Malis  writes:
> As I understand it, a merge join will currently read all tuples from both
> subqueries (besides early termination). I believe it should be possible to
> take advantages of the indexes on one or both of the tables being read from
> to skip a large number of tuples that would currently be read.

IIUC, what you're proposing is to replace "read past some tuples in the
index" with "re-descend the index btree".  Yes, that would win if it
allows skipping over a large number of index entries, but it could lose
big-time if not.  The difficulty is that I don't see how the merge join
level could predict whether it would win for any particular advance step.
You'd really need most of the smarts to be in the index AM.

You might want to troll the archives for past discussions of index skip
scan, which is more or less the same idea (could possibly be exactly the
same idea, depending on how it's implemented).  I think we had arrived
at the conclusion that re-descent from the root might be appropriate
when transitioning to another index page, but not intra-page.  Anyway
no one's produced a concrete patch yet.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] cost_sort() may need to be updated

2016-09-11 Thread Peter Geoghegan
On Sun, Sep 11, 2016 at 9:01 AM, Tom Lane  wrote:
> Peter Geoghegan  writes:
>> I think that we *can* refine this guess, and should, because random
>> I/O is really quite unlikely to be a large cost these days (I/O in
>> general often isn't a large cost, actually). More fundamentally, I
>> think it's a problem that cost_sort() thinks that external sorts are
>> far more expensive than internal sorts in general. There is good
>> reason to think that that does not reflect the reality. I think we can
>> expect external sorts to be *faster* than internal sorts with
>> increasing regularity in Postgres 10.
>
> TBH, if that's true, haven't you broken something?

It's possible for external sorts to be faster some of the time because
the memory access patterns can be more cache efficient: smaller runs
are better when accessing tuples in sorted order, scattered across
memory. More importantly, the sort can start returning tuples earlier
in the common case where a final on-the-fly merge can be performed. In
principle, you could adapt internal sorts to have the same advantages,
but that hasn't and probably won't happen. Finally, the external sort
I/O costs grow linearly, whereas the CPU costs grow in a linearithmic
fashion, which will eventually come to dominate. We can hide the
latency of those costs pretty well, too, with asynchronous I/O.

I'm not arguing that cost_sort() should think that external sorts are
cheaper under any circumstances, since all of this is very hard to
model. I only mention this because it illustrates nicely that
cost_sort() has the wrong idea.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] cost_sort() may need to be updated

2016-09-11 Thread Tom Lane
Peter Geoghegan  writes:
> I think that we *can* refine this guess, and should, because random
> I/O is really quite unlikely to be a large cost these days (I/O in
> general often isn't a large cost, actually). More fundamentally, I
> think it's a problem that cost_sort() thinks that external sorts are
> far more expensive than internal sorts in general. There is good
> reason to think that that does not reflect the reality. I think we can
> expect external sorts to be *faster* than internal sorts with
> increasing regularity in Postgres 10.

TBH, if that's true, haven't you broken something?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2016-09-11 Thread Peter Geoghegan
On Sun, Sep 11, 2016 at 6:28 AM, Heikki Linnakangas  wrote:
> * I renamed "tuplesort_heap_siftup()" to "tuplesort_delete_top()". I realize
> that this is controversial, per the discussion on the "Is
> tuplesort_heap_siftup() a misnomer?" thread. However, now that we have a new
> function, "tuplesort_heap_replace_top()", which is exactly the same
> algorithm as the "delete_top()" algorithm, calling one of them "siftup"
> became just too confusing.

I feel pretty strongly that this was the correct decision. I would
have gone further, and removed any mention of "Sift up", but you can't
win them all.

> * Instead of "root_displace", I used the name "replace_top", and
> "delete_top" for the old siftup function. Because we use "top" to refer to
> memtuples[0] more commonly than "root", in the existing comments.

Fine by me.

> * I shared the code between the delete-top and replace-top. Delete-top now
> calls the replace-top function, with the last element of the heap. Both
> functions have the same signature, i.e. they both take the checkIndex
> argument. Peter's patch left that out for the "replace" function, on
> performance grounds, but if that's worthwhile, that seems like a separate
> optimization. Might be worth benchmarking that separately, but I didn't want
> to conflate that with this patch.

Okay.

> * I replaced a few more siftup+insert calls with the new combined
> replace-top operation. Because why not.

I suppose that the consistency has value, from a code clarity standpoint.

> Thanks for the patch, Peter, and thanks for the review, Claudio!

Thanks Heikki!

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tuplesort merge pre-reading

2016-09-11 Thread Heikki Linnakangas
Here's a new version of these patches, rebased over current master. I 
squashed the two patches into one, there's not much point to keep them 
separate.


- Heikki

>From 6e3813d876cf3efbe5f1b80c45f44ed5494304ab Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Sun, 11 Sep 2016 18:41:44 +0300
Subject: [PATCH 1/1] Change the way pre-reading in external sort's merge phase
 works.

Don't pre-read tuples into SortTuple slots during merge. Instead, use the
memory for larger read buffers in logtape.c. We're doing the same number
of READTUP() calls either way, but managing the pre-read SortTuple slots
is much more complicated. Also, the on-tape representation is more compact
than SortTuples, so we can fit more pre-read tuples into the same amount
of memory this way. And we have better cache-locality, when we use just a
small number of SortTuple slots.

Now that we only hold one tuple from each tape in the SortTuple slots, we
can greatly simplify the "batch memory" management. We now maintain a
small set of fixed-sized buffers, to hold the tuples, and fall back to
palloc() for larger tuples. We use this method during all merge phases,
not just the final merge.
---
 src/backend/utils/sort/logtape.c   | 134 -
 src/backend/utils/sort/tuplesort.c | 984 +++--
 src/include/utils/logtape.h|   1 +
 3 files changed, 389 insertions(+), 730 deletions(-)

diff --git a/src/backend/utils/sort/logtape.c b/src/backend/utils/sort/logtape.c
index 7745207..05d7697 100644
--- a/src/backend/utils/sort/logtape.c
+++ b/src/backend/utils/sort/logtape.c
@@ -131,9 +131,12 @@ typedef struct LogicalTape
 	 * reading.
 	 */
 	char	   *buffer;			/* physical buffer (separately palloc'd) */
+	int			buffer_size;	/* allocated size of the buffer */
 	long		curBlockNumber; /* this block's logical blk# within tape */
 	int			pos;			/* next read/write position in buffer */
 	int			nbytes;			/* total # of valid bytes in buffer */
+
+	int			read_buffer_size;	/* buffer size to use when reading */
 } LogicalTape;
 
 /*
@@ -228,6 +231,53 @@ ltsReadBlock(LogicalTapeSet *lts, long blocknum, void *buffer)
 }
 
 /*
+ * Read as many blocks as we can into the per-tape buffer.
+ *
+ * The caller can specify the next physical block number to read, in
+ * datablocknum, or -1 to fetch the next block number from the internal block.
+ * If datablocknum == -1, the caller must've already set curBlockNumber.
+ *
+ * Returns true if anything was read, 'false' on EOF.
+ */
+static bool
+ltsReadFillBuffer(LogicalTapeSet *lts, LogicalTape *lt, long datablocknum)
+{
+	lt->pos = 0;
+	lt->nbytes = 0;
+
+	do
+	{
+		/* Fetch next block number (unless provided by caller) */
+		if (datablocknum == -1)
+		{
+			datablocknum = ltsRecallNextBlockNum(lts, lt->indirect, lt->frozen);
+			if (datablocknum == -1L)
+break;			/* EOF */
+			lt->curBlockNumber++;
+		}
+
+		/* Read the block */
+		ltsReadBlock(lts, datablocknum, (void *) (lt->buffer + lt->nbytes));
+		if (!lt->frozen)
+			ltsReleaseBlock(lts, datablocknum);
+
+		if (lt->curBlockNumber < lt->numFullBlocks)
+			lt->nbytes += BLCKSZ;
+		else
+		{
+			/* EOF */
+			lt->nbytes += lt->lastBlockBytes;
+			break;
+		}
+
+		/* Advance to next block, if we have buffer space left */
+		datablocknum = -1;
+	} while (lt->nbytes < lt->buffer_size);
+
+	return (lt->nbytes > 0);
+}
+
+/*
  * qsort comparator for sorting freeBlocks[] into decreasing order.
  */
 static int
@@ -546,6 +596,8 @@ LogicalTapeSetCreate(int ntapes)
 		lt->numFullBlocks = 0L;
 		lt->lastBlockBytes = 0;
 		lt->buffer = NULL;
+		lt->buffer_size = 0;
+		lt->read_buffer_size = BLCKSZ;
 		lt->curBlockNumber = 0L;
 		lt->pos = 0;
 		lt->nbytes = 0;
@@ -628,7 +680,10 @@ LogicalTapeWrite(LogicalTapeSet *lts, int tapenum,
 
 	/* Allocate data buffer and first indirect block on first write */
 	if (lt->buffer == NULL)
+	{
 		lt->buffer = (char *) palloc(BLCKSZ);
+		lt->buffer_size = BLCKSZ;
+	}
 	if (lt->indirect == NULL)
 	{
 		lt->indirect = (IndirectBlock *) palloc(sizeof(IndirectBlock));
@@ -636,6 +691,7 @@ LogicalTapeWrite(LogicalTapeSet *lts, int tapenum,
 		lt->indirect->nextup = NULL;
 	}
 
+	Assert(lt->buffer_size == BLCKSZ);
 	while (size > 0)
 	{
 		if (lt->pos >= BLCKSZ)
@@ -709,18 +765,19 @@ LogicalTapeRewind(LogicalTapeSet *lts, int tapenum, bool forWrite)
 			Assert(lt->frozen);
 			datablocknum = ltsRewindFrozenIndirectBlock(lts, lt->indirect);
 		}
+
+		/* Allocate a read buffer */
+		if (lt->buffer)
+			pfree(lt->buffer);
+		lt->buffer = palloc(lt->read_buffer_size);
+		lt->buffer_size = lt->read_buffer_size;
+
 		/* Read the first block, or reset if tape is empty */
 		lt->curBlockNumber = 0L;
 		lt->pos = 0;
 		lt->nbytes = 0;
 		if (datablocknum != -1L)
-		{
-			ltsReadBlock(lts, datablocknum, (void *) lt->buffer);
-			if (!lt->frozen)
-ltsReleaseBlock(lts, datablocknum);
-			lt->nbytes = (lt->curBlockNumber < lt->numFullBlocks) ?
-BLCKSZ : lt->lastBlockBytes;
-		}
+			

Re: [HACKERS] Use nanosleep() for pg_usleep() on Unix/Linux?

2016-09-11 Thread Tom Lane
Paul Guo  writes:
> I happened to read the pg_usleep() code recently. I'm wondering
> if we could implement it using the posix function nanosleep(),
> instead of by select().

That actually looks like a pretty good idea.  Some research says that
nanosleep() is defined in SUSv2, our usual baseline for Unix features;
and it appears to be implemented and work per spec on even my oldest
buildfarm critters.  So portability looks like a non problem.

It would be nice to have two versions of pg_usleep, one where handling
of a signal was guaranteed to terminate the wait and one where it was
guaranteed not to, rather than the current no-promises situation.
It looks like we could have that on the Unix side using nanosleep(),
but it's not clear to me whether we can do the second case on Windows.

> nanosleep() is designed with higher time resolution, besides it provide
> remaining time if is interrupted by signal so that pg_usleep() could
> be implemented more accurately.
> or combine with clock_gettime() to control the sleep time more accurately.

I do not think we really care about sub-microsecond sleep resolution.
But it would be good if we could implement a sleep that would be
approximately the requested length even with signals received meanwhile.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Alter or rename enum value

2016-09-11 Thread Tom Lane
Jim Nasby  writes:
> On 9/8/16 4:55 AM, Emre Hasegeli wrote:
>> The main problem that has been discussed before was the indexes.  One
>> way is to tackle with it is to reindex all the tables after the
>> operation.  Currently we are doing it when the datatype of indexed
>> columns change.  So it should be possible, but very expensive.

> Why not just disallow dropping a value that's still in use? That's 
> certainly what I would prefer to happen by default...

Even ignoring the hidden-values-in-indexes problem, how would you
discover that it's still in use?  Not to mention preventing new
insertions after you look?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-09-11 Thread Christian Convey
> P.S. I'm asking because I was planning to review that patch.  But I
>> can't tell if any more review by a non-committer is still required by
>> the commitfest workflow.
>
>
> I think this has gotten enough attention, for the commitfest workflow. The
> workflow is flexible, depending on the nature of patch. But of course, if
> you're interested, feel free to review and comment anyway!

Hi Heikki,

Thanks for the help.  If you think nothing more is needed for this
patch set w.r.t. the 2016-09 commitfest, I'll move on to other things.

- Christian


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-09-11 Thread Heikki Linnakangas

On 09/11/2016 01:20 AM, Christian Convey wrote:

Hi Heikki,

Could I ask you a newbie-reviewer question about something I'm seeing
here?  https://commitfest.postgresql.org/10/776/

From some reading I've done (e.g., Stephen Frost's PGCon 2011 slides),
I got the impression that a successful patch would always have this
sequence of states in commitfest:
  1. patch-record created
  ...
  2. Needs Review
  ...
  3. Ready for Committer

But if I'm reading the patch's activity log correctly, it looks like
you marked the patch as "Ready for Committer" (2016-09-06 18:59:02)
without any record of it having been reviewed.

Was that intentional?


Yeah, I commented on the patches at 
https://www.postgresql.org/message-id/e8e7e5a7-0308-2c36-d32a-7aab16ba498c%40iki.fi. 
It was very cursory, but I figured that would be sufficient feedback for 
now, for Peter to proceed with the first few straightforward patches in 
the series. I don't think there's consensus that we want to do more than 
that, to actually switch to C++.



P.S. I'm asking because I was planning to review that patch.  But I
can't tell if any more review by a non-committer is still required by
the commitfest workflow.


I think this has gotten enough attention, for the commitfest workflow. 
The workflow is flexible, depending on the nature of patch. But of 
course, if you're interested, feel free to review and comment anyway!


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2016-09-11 Thread Heikki Linnakangas

On 09/10/2016 03:22 AM, Claudio Freire wrote:

Overall, however, I believe the patch is in good shape. Only minor
form issues need to be changed, the functionality seems both desirable
and ready.


Pushed this "displace root" patch, with some changes:

* I renamed "tuplesort_heap_siftup()" to "tuplesort_delete_top()". I 
realize that this is controversial, per the discussion on the "Is 
tuplesort_heap_siftup() a misnomer?" thread. However, now that we have a 
new function, "tuplesort_heap_replace_top()", which is exactly the same 
algorithm as the "delete_top()" algorithm, calling one of them "siftup" 
became just too confusing. If anything, the new "replace_top" 
corresponds more closely to Knuth's siftup algorithm; delete-top is a 
special case of it. I added a comment on that to replace_top. I hope 
everyone can live with this.


* Instead of "root_displace", I used the name "replace_top", and 
"delete_top" for the old siftup function. Because we use "top" to refer 
to memtuples[0] more commonly than "root", in the existing comments.


* I shared the code between the delete-top and replace-top. Delete-top 
now calls the replace-top function, with the last element of the heap. 
Both functions have the same signature, i.e. they both take the 
checkIndex argument. Peter's patch left that out for the "replace" 
function, on performance grounds, but if that's worthwhile, that seems 
like a separate optimization. Might be worth benchmarking that 
separately, but I didn't want to conflate that with this patch.


* I replaced a few more siftup+insert calls with the new combined 
replace-top operation. Because why not.


Thanks for the patch, Peter, and thanks for the review, Claudio!

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Useless dependency assumption libxml2 -> libxslt in MSVC scripts

2016-09-11 Thread Michael Paquier
On Sun, Sep 11, 2016 at 1:03 AM, Tom Lane  wrote:
> It might accidentally fail to fail as-is, but likely it would be better
> not to be pushing garbage paths into extraincludes and extralibs when
> some of those options aren't set.

Right, we need to correct something here. But this block does not need
any adjustment: it can just be deleted. A contrib module is added via
AddProject() in Solution.pm, and if the build is done with xml, xslt
or iconv their libraries and includes are added in any case, for any
declared project. So declaring a dependency with xml or xslt is just
moot work, and actually this would be broken without AddProject
because it is missing to list iconv.lib... At the same time, I think
that we could fix the inclusions with uuid for uuid-ossp, and just put
them as well in AddProject with the rest. Please see the updated
cleanup patch attached.
-- 
Michael
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index b3ed1f5..93dfd24 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -381,18 +381,7 @@ sub mkvcbuild
 	$zic->AddDirResourceFile('src/timezone');
 	$zic->AddReference($libpgcommon, $libpgport);
 
-	if ($solution->{options}->{xml})
-	{
-		$contrib_extraincludes->{'pgxml'} = [
-			$solution->{options}->{xml} . '/include',
-			$solution->{options}->{xslt} . '/include',
-			$solution->{options}->{iconv} . '/include' ];
-
-		$contrib_extralibs->{'pgxml'} = [
-			$solution->{options}->{xml} . '/lib/libxml2.lib',
-			$solution->{options}->{xslt} . '/lib/libxslt.lib' ];
-	}
-	else
+	if (!$solution->{options}->{xml})
 	{
 		push @contrib_excludes, 'xml2';
 	}
@@ -402,14 +391,7 @@ sub mkvcbuild
 		push @contrib_excludes, 'sslinfo';
 	}
 
-	if ($solution->{options}->{uuid})
-	{
-		$contrib_extraincludes->{'uuid-ossp'} =
-		  [ $solution->{options}->{uuid} . '/include' ];
-		$contrib_extralibs->{'uuid-ossp'} =
-		  [ $solution->{options}->{uuid} . '/lib/uuid.lib' ];
-	}
-	else
+	if (!$solution->{options}->{uuid})
 	{
 		push @contrib_excludes, 'uuid-ossp';
 	}
diff --git a/src/tools/msvc/Solution.pm b/src/tools/msvc/Solution.pm
index 9cb1ad3..8217d06 100644
--- a/src/tools/msvc/Solution.pm
+++ b/src/tools/msvc/Solution.pm
@@ -37,12 +37,9 @@ sub _new
 	  unless exists $options->{float8byval};
 	die "float8byval not permitted on 32 bit platforms"
 	  if $options->{float8byval} && $bits == 32;
-	if ($options->{xml})
+	if ($options->{xslt} && !$options->{xml})
 	{
-		if (!($options->{xslt} && $options->{iconv}))
-		{
-			die "XML requires both XSLT and ICONV\n";
-		}
+		die "XSLT requires XML\n";
 	}
 	$options->{blocksize} = 8
 	  unless $options->{blocksize};# undef or 0 means default
@@ -555,6 +552,11 @@ sub AddProject
 		$proj->AddIncludeDir($self->{options}->{xslt} . '\include');
 		$proj->AddLibrary($self->{options}->{xslt} . '\lib\libxslt.lib');
 	}
+	if ($self->{options}->{uuid})
+	{
+		$proj->AddIncludeDir($self->{options}->{uuid} . '\include');
+		$proj->AddLibrary($self->{options}->{uuid} . '\lib\uuid.lib');
+	}
 	return $proj;
 }
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Override compile time log levels of specific messages/modules

2016-09-11 Thread Craig Ringer
On 11 Sep. 2016 11:31, "Jim Nasby"  wrote:

> Actually, I wish this was a straight-up logging level feature, because I
> need it all the time when debugging complicated user-level code.
> Specifically, I wish there was a GUC that would alter
> (client|log)_min_messages upon entering a specific function, either for
> just that function or for anything that function subsequently called.

We have that or close to it, but it's restricted for security. IIRC you
can't SET log_min_messages as a non-superuser, including as part of CREATE
FUNCTION ... SET. Presumably because lowering it would let the user hide
activity admins expect to be recorded. Raising it would let them possibly
DoS via log spam - though that argument is rather undermined by the ability
to write plpgsql that does RAISE in a tight loop.

I'd like to be able to let users make logging more detailed but not less,
so they can only SET it to something equal to or more detailed than what
their session had at the start. Should actually be pretty trivial to
implement too.

You can then SET at session level, function level, SET LOCAL, etc. If you
want to control it dynamically via GUC, you add calls to your functions to
check a custom user GUC and SET the log level accordingly, using whatever
logic you like.

> Bonus points if you could also specify a regex that the message text had
> to match.
>

An errfinish() hook would be nice for that.


[HACKERS] Use nanosleep() for pg_usleep() on Unix/Linux?

2016-09-11 Thread Paul Guo
I happened to read the pg_usleep() code recently. I'm wondering
if we could implement it using the POSIX function nanosleep()
instead of select().

nanosleep() is designed with higher time resolution; besides, it provides the
remaining time if it is interrupted by a signal, so pg_usleep() could
be implemented more accurately,

The code for pg_usleep() could be similar to this:

while (nanosleep(&delay, &delay) == -1 && errno == EINTR)
  continue;


or combine with clock_gettime() to control the sleep time more accurately.
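
For example, a fuller sketch of the idea (illustrative only; it keeps
restarting with the remaining time, so signals never shorten the sleep):

#include <errno.h>
#include <time.h>

void
pg_usleep(long microsec)
{
	struct timespec delay;

	if (microsec <= 0)
		return;

	delay.tv_sec = microsec / 1000000L;
	delay.tv_nsec = (microsec % 1000000L) * 1000;

	/* if a signal interrupts the sleep, go back to sleep for what remains */
	while (nanosleep(&delay, &delay) == -1 && errno == EINTR)
		continue;
}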


Regards,

Paul


Re: [HACKERS] Write Ahead Logging for Hash Indexes

2016-09-11 Thread Mark Kirkwood

On 11/09/16 19:16, Mark Kirkwood wrote:




On 11/09/16 17:01, Amit Kapila wrote:

...Do you think we can do some read-only
workload benchmarking using this server?  If yes, then probably you
can use concurrent hash index patch [1] and cache the metapage patch
[2] (I think Mithun needs to rebase his patch) to do so.



[1] - 
https://www.postgresql.org/message-id/caa4ek1j6b8o4pcepqrxnyblvbftonmjeem+qn0jzx31-obx...@mail.gmail.com
[2] - 
https://www.postgresql.org/message-id/cad__ouhj29cebif_flge4t9vj_-cfxbwcxhjo+d_16txbem...@mail.gmail.com





I can do that - are we checking for hangs/assertions, or comparing 
patched vs unpatched performance (for the metapage patch)?





So, assuming the latter - testing performance with and without the 
metapage patch:


For my 1st runs:

- cpus 16, ram 16G
- size 100, clients 32

I'm seeing no difference in performance for a read-only (-S) pgbench 
workload (with everybody using hash indexes). I guess that's not that 
surprising, as the db fits in RAM (1.6G and we have 16G). So I'll retry 
with a bigger dataset (I suspect size 2000 is needed).


regards

Mark



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Write Ahead Logging for Hash Indexes

2016-09-11 Thread Mark Kirkwood



On 11/09/16 17:01, Amit Kapila wrote:

On Sun, Sep 11, 2016 at 4:10 AM, Mark Kirkwood
 wrote:


performed several 10 hour runs on size 100 database using 32 and 64 clients.
For the last run I rebuilt with assertions enabled. No hangs or assertion
failures.


Thanks for verification.  Do you think we can do some read-only
workload benchmarking using this server?  If yes, then probably you
can use concurrent hash index patch [1] and cache the metapage patch
[2] (I think Mithun needs to rebase his patch) to do so.



[1] - 
https://www.postgresql.org/message-id/caa4ek1j6b8o4pcepqrxnyblvbftonmjeem+qn0jzx31-obx...@mail.gmail.com
[2] - 
https://www.postgresql.org/message-id/cad__ouhj29cebif_flge4t9vj_-cfxbwcxhjo+d_16txbem...@mail.gmail.com




I can do that - are we checking for hangs/assertions, or comparing 
patched vs unpatched performance (for the metapage patch)?


regards

Mark


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Alter or rename enum value

2016-09-11 Thread Emre Hasegeli
> Why not just disallow dropping a value that's still in use? That's certainly
> what I would prefer to happen by default...

Of course, we should disallow it.  The problem is what to do next.
We cannot just remove the value, because it might still be referenced
from the inner nodes of the indexes.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers