Re: [HACKERS] Bug in StartupSUBTRANS

2016-02-09 Thread Tom Lane
Simon Riggs  writes:
> Your patch looks right to me, so I will commit, barring objections... with
> backpatch. Likely to 9.0, AFAICS.

9.0 is out of support and should not be patched anymore.

I agree that the patch is basically correct, though I'd personally
write it without bothering with the extra variable:

+   /* must account for wraparound */
+   if (startPage > TransactionIdToPage(0xFFFFFFFF))
+   startPage = 0;
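
For anyone reading along, here is a minimal sketch of where that check
lands; the loop shape and names follow StartupSUBTRANS in subtrans.c, but
treat it as an illustration rather than the committed fix:

    startPage = TransactionIdToPage(oldestActiveXID);
    endPage = TransactionIdToPage(ShmemVariableCache->nextXid);
    while (startPage != endPage)
    {
        (void) ZeroSUBTRANSPage(startPage);
        startPage++;
        /* must account for wraparound */
        if (startPage > TransactionIdToPage(0xFFFFFFFF))
            startPage = 0;
    }
    (void) ZeroSUBTRANSPage(startPage);

Without the wraparound check, startPage can run past the last valid
pg_subtrans page number and never reach endPage, which is the bug being
fixed.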

Also, the comment at line 45 is now wrong and needs an addition.

regards, tom lane




Re: [HACKERS] Multi-tenancy with RLS

2016-02-09 Thread Stephen Frost
JD,

* Joshua D. Drake (j...@commandprompt.com) wrote:
> pg_dump -U $non-super_user
> 
> Should just work, period.

That ship has sailed already, where you're running a pg_dump against
objects you don't own and which have RLS enabled on them.

Thanks!

Stephen




Re: [HACKERS] pl/pgSQL, get diagnostics and big data

2016-02-09 Thread Christian Ullrich
Ah, so it turns out I should have used the commitfest tool. My 
apologies; I will send the whole thing through that again. Please 
disregard the earlier message.






Re: [HACKERS] Multi-tenancy with RLS

2016-02-09 Thread Joshua D. Drake

On 02/09/2016 12:05 PM, Robert Haas wrote:


> That's true.  But I should also have an expectation that running
> pg_dump won't trigger arbitrary code execution, which is why by
> default, pg_dump sets row_security to OFF.  That way, if a row
> security policy applies, I get an error rather than an incomplete,
> possibly-invalid dump (and arbitrary code execution on the server
> side).  If I'm OK with doing the dump subject to row security, I can
> rerun with --enable-row-security.  But this proposal would force
> non-superusers to always use that option, and that's not a good idea.



If I understand correctly what we are talking about here is:

1. If RLS is enabled and a non-super user issues a pg_dump, it will 
error unless I issue --enable-row-security


2. If RLS is not enabled and a non-super user issues a pg_dump the 
behavior is basically what it is now.


3. If RLS is enabled or not and I am a super user, it doesn't matter 
either way.


From my perspective, I should not have to enable row security as a 
non-super user to take a pg_dump. It should just work for what I am 
allowed to see. In other words:


pg_dump -U $non-super_user

Should just work, period.

Sincerely,

Joshua D. Drake

--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.




Re: [HACKERS] Multi-tenancy with RLS

2016-02-09 Thread Robert Haas
On Fri, Jan 15, 2016 at 11:53 AM, Stephen Frost  wrote:
>> Whereupon you'd have no certainty that what you got represented a
>> complete dump of your own data.
>
> It would be a dump of what you're allowed to see, rather than an error
> saying you couldn't dump something you couldn't see, which is the
> alternative we're talking about here.  Even if you've got a dependency
> to something-or-other, if you don't have access to it, then you're
> going to get an error.

I think you're dismissing Tom's concerns far too lightly.  The
row_security=off mode, which is the default, becomes unusable for
non-superusers under this proposal.  That's bad.  And if you switch to
the other mode, then you might accidentally fail to get all of the
data in some table you're trying to back up.  That's bad too: that's
why it isn't the default.  There's a big difference between saying
"I'm OK with not dumping the tables I can't see" and "I'm OK with not
dumping all of the data in some table I *can* see".

It seems to me that there's a big difference between policies we ship
out of the box and policies that are created by users: specifically,
the former can be assumed benign, while the latter can't.  I think
that difference matters here, although I'm not sure exactly where to
go with it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Multi-tenancy with RLS

2016-02-09 Thread Robert Haas
On Tue, Feb 9, 2016 at 3:01 PM, Joe Conway  wrote:
> On 02/09/2016 11:47 AM, Robert Haas wrote:
>> On Fri, Jan 15, 2016 at 11:53 AM, Stephen Frost  wrote:
>>>> Whereupon you'd have no certainty that what you got represented a
>>>> complete dump of your own data.
>>>
>>> It would be a dump of what you're allowed to see, rather than an error
>>> saying you couldn't dump something you couldn't see, which is the
>>> alternative we're talking about here.  Even if you've got a dependency
>>> to something-or-other, if you don't have access to it, then you're
>>> going to get an error.
>>
>> I think you're dismissing Tom's concerns far too lightly.  The
>> row_security=off mode, which is the default, becomes unusable for
>> non-superusers under this proposal.  That's bad.  And if you switch to
>> the other mode, then you might accidentally fail to get all of the
>> data in some table you're trying to back up.  That's bad too: that's
>> why it isn't the default.  There's a big difference between saying
>> "I'm OK with not dumping the tables I can't see" and "I'm OK with not
>> dumping all of the data in some table I *can* see".
>
> I don't grok what you're saying here. If I, as a non-superuser could
> somehow see all the rows in a table just by running pg_dump, including
> rows that I could not normally see due to RLS policies, *that* would be
> bad. I should have no expectation of being able to dump rows I can't
> normally see.

That's true.  But I should also have an expectation that running
pg_dump won't trigger arbitrary code execution, which is why by
default, pg_dump sets row_security to OFF.  That way, if a row
security policy applies, I get an error rather than an incomplete,
possibly-invalid dump (and arbitrary code execution on the server
side).  If I'm OK with doing the dump subject to row security, I can
rerun with --enable-row-security.  But this proposal would force
non-superusers to always use that option, and that's not a good idea.
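
To make the two modes concrete, here is a hypothetical session (table,
policy and role names invented; error text approximate):

    -- as the table owner: a policy that applies to other users
    CREATE TABLE t (a int);
    ALTER TABLE t ENABLE ROW LEVEL SECURITY;
    CREATE POLICY p ON t USING (a > 0);
    GRANT SELECT ON t TO alice;

    $ pg_dump -U alice -t t mydb
    pg_dump: ... ERROR:  query would be affected by row-level security
    policy for table "t"

    $ pg_dump -U alice -t t mydb --enable-row-security
    -- succeeds, but dumps only the rows alice's policies let her see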

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] proposal: make NOTIFY list de-duplication optional

2016-02-09 Thread Josh Kupershmidt
On Tue, Feb 9, 2016 at 2:16 PM, Filip Rembiałkowski
 wrote:
> But then it becomes disputable if SQL syntax change makes sense.
>
> ---we had this,
>  NOTIFY channel [ , payload ]
> ---and in this patch we have this
> NOTIFY [ ALL | DISTINCT ] channel [ , payload ]
>  ---  but maybe we should have this?
> NOTIFY channel [ , payload [ , mode ] ]

I think using ALL to mean "don't worry about de-duplication" could be
a bit confusing, especially as there was some interest recently in
supporting wildcard notifications:
http://www.postgresql.org/message-id/52693fc5.7070...@gmail.com

and conceivably we might want to support a way to notify all
listeners, i.e. NOTIFY * as proposed in that thread. If we ever
supported wildcard notifies, ALL may be easily confused to mean "all
channel names".

What about adopting the options-inside-parentheses format, the way
EXPLAIN does nowadays, something like:

NOTIFY (DEDUPLICATE FALSE, MODE IMMEDIATE) mychannel;

Josh




Re: [HACKERS] proposal: make NOTIFY list de-duplication optional

2016-02-09 Thread Tom Lane
Josh Kupershmidt  writes:
> On Tue, Feb 9, 2016 at 2:16 PM, Filip Rembiałkowski
>  wrote:
>> But then it becomes disputable if SQL syntax change makes sense.
>> 
>> ---we had this,
>> NOTIFY channel [ , payload ]
>> ---and in this patch we have this
>> NOTIFY [ ALL | DISTINCT ] channel [ , payload ]
>> ---  but maybe we should have this?
>> NOTIFY channel [ , payload [ , mode ] ]

> What about adopting the options-inside-parentheses format, the way
> EXPLAIN does nowadays, something like:
> NOTIFY (DEDUPLICATE FALSE, MODE IMMEDIATE) mychannel;

FWIW, I think it would be a good thing if the NOTIFY statement syntax were
not remarkably different from the syntax used in the pg_notify() function
call.  To do otherwise would certainly be confusing.  So on the whole
I'd go with the "NOTIFY channel [ , payload [ , mode ] ]" option.
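
To illustrate the parallel, something like the following, where the third
"mode" argument is only the hypothetical extension being discussed, not an
existing feature:

    NOTIFY jobs, 'payload', 'all';                -- proposed statement form
    SELECT pg_notify('jobs', 'payload', 'all');   -- matching function form

    -- today, only the two-argument forms exist:
    NOTIFY jobs, 'payload';
    SELECT pg_notify('jobs', 'payload');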

regards, tom lane




Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-09 Thread Robert Haas
On Tue, Feb 9, 2016 at 8:39 AM, Ashutosh Bapat
 wrote:
> Thanks Jeevan for your review and comments. PFA the patch which fixes those.

Committed with a couple more small adjustments.

Woohoo, finally!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Tracing down buildfarm "postmaster does not shut down" failures

2016-02-09 Thread Tom Lane
I wrote:
> In any case, we should proceed with fixing things so that buildfarm owners
> can specify a higher shutdown timeout for especially slow critters.

I looked into doing this as I suggested yesterday, namely modifying the
buildfarm scripts, and soon decided that it would be a mess; there are
too many cases where "pg_ctl stop" is not invoked directly by the script.

I'm now in favor of applying the PGCTLTIMEOUT patch Noah proposed, and
*removing* the two existing hacks in run_build.pl that try to force -t 120.

The only real argument I can see against that approach is that we'd have
to back-patch the PGCTLTIMEOUT patch to all active branches if we want
to stop the buildfarm failures.  We don't usually back-patch feature
additions.  On the other hand, this wouldn't be the first time we've
back-patched something on grounds of helping the buildfarm, so I find
that argument pretty weak.
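
For a buildfarm owner, the proposed knob would presumably be used along
these lines (the 300-second value is invented for illustration):

    # in the animal's build environment
    export PGCTLTIMEOUT=300    # pg_ctl waits this many seconds instead of 60
    pg_ctl -D "$PGDATA" -m fast stop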

regards, tom lane




Re: [HACKERS] Multi-tenancy with RLS

2016-02-09 Thread Stephen Frost
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Tue, Feb 9, 2016 at 3:01 PM, Joe Conway  wrote:
> > On 02/09/2016 11:47 AM, Robert Haas wrote:
> >> On Fri, Jan 15, 2016 at 11:53 AM, Stephen Frost  wrote:
> >>>> Whereupon you'd have no certainty that what you got represented a
> >>>> complete dump of your own data.
> >>>
> >>> It would be a dump of what you're allowed to see, rather than an error
> >>> saying you couldn't dump something you couldn't see, which is the
> >>> alternative we're talking about here.  Even if you've got a dependency
> >>> to something-or-other, if you don't have access to it, then you're
> >>> going to get an error.
> >>
> >> I think you're dismissing Tom's concerns far too lightly.  The
> >> row_security=off mode, which is the default, becomes unusable for
> >> non-superusers under this proposal.  That's bad.  And if you switch to
> >> the other mode, then you might accidentally fail to get all of the
> >> data in some table you're trying to back up.  That's bad too: that's
> >> why it isn't the default.  There's a big difference between saying
> >> "I'm OK with not dumping the tables I can't see" and "I'm OK with not
> >> dumping all of the data in some table I *can* see".
> >
> > I don't grok what you're saying here. If I, as a non-superuser could
> > somehow see all the rows in a table just by running pg_dump, including
> > rows that I could not normally see due to RLS policies, *that* would be
> > bad. I should have no expectation of being able to dump rows I can't
> > normally see.
> 
> That's true.  But I should also have an expectation that running
> pg_dump won't trigger arbitrary code execution, which is why by
> default, pg_dump sets row_security to OFF.  That way, if a row
> security policy applies, I get an error rather than an incomplete,
> possibly-invalid dump (and arbitrary code execution on the server
> side).  If I'm OK with doing the dump subject to row security, I can
> rerun with --enable-row-security.  But this proposal would force
> non-superusers to always use that option, and that's not a good idea.

Arbitrary code execution is quite a different concern from the prior
concern regarding incomplete dumps.

To the extent that untrusted code execution is an issue (and my
experience with environments which would deploy RLS tells me that it
isn't a practical concern), an option could be created which would cause
an error to be thrown on non-catalog RLS being run.

When it comes to multi-tenancy environments, as this thread is about,
chances are the only tables you can see are ones which you own or are
owned by a trusted user, which is why I don't view this as a practical
concern, but I'm not against having a solution to address the issue
raised regarding arbitrary code execution, provided it doesn't create
more problems than it purports to solve.

Thanks!

Stephen




Re: [HACKERS] [COMMITTERS] pgsql: postgres_fdw: Push down joins to remote servers.

2016-02-09 Thread Robert Haas
On Tue, Feb 9, 2016 at 3:10 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> postgres_fdw: Push down joins to remote servers.
>
> The buildfarm is not very impressed with this patch.

Well, I guess that's why we have regression tests.  It looks like that
test case introduces a dependency on particular CTID values that
wasn't there before.  That doesn't seem like a very good idea; I'll
remove that test.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Tracing down buildfarm "postmaster does not shut down" failures

2016-02-09 Thread Tom Lane
I wrote:
> Noah Misch  writes:
>> On Mon, Feb 08, 2016 at 02:15:48PM -0500, Tom Lane wrote:
>>> We've seen variants
>>> on this theme on half a dozen machines just in the past week --- and it
>>> seems to mostly happen in 9.5 and HEAD, which is fishy.

>> It has been affecting only the four AIX animals, which do share hardware.
>> (Back in 2015 and once in 2016-01, it did affect axolotl and shearwater.)

> Certainly your AIX critters have shown this a bunch, but here's another
> current example:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=axolotl&dt=2016-02-08%2014%3A49%3A23

While we've not seen an actual test failure since I installed the tracing
code, we do have reports from the four AIX critters, and they are pretty
instructive.  See for example
http://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=hornet&dt=2016-02-09%2016%3A27%3A05&stg=stopdb-C-2
which is

Snapshot: 2016-02-09 16:27:05

waiting for server to shut down at 2016-02-09 18:04:09.159 UTC done
server stopped
=== db log file ==
2016-02-09 18:04:09.159 UTC [56ba2944.ee01bc:2] LOG:  received fast shutdown request at 2016-02-09 18:04:09.159 UTC
2016-02-09 18:04:09.159 UTC [56ba2944.ee01bc:3] LOG:  aborting any active transactions
2016-02-09 18:04:09.160 UTC [56ba2944.14501d8:2] LOG:  autovacuum launcher shutting down at 2016-02-09 18:04:09.160 UTC
2016-02-09 18:04:09.258 UTC [56ba2944.363000e:40] LOG:  shutting down at 2016-02-09 18:04:09.258 UTC
2016-02-09 18:04:09.258 UTC [56ba2944.363000e:41] LOG:  CheckPointGuts starting at 2016-02-09 18:04:09.258 UTC
2016-02-09 18:04:09.258 UTC [56ba2944.363000e:42] LOG:  CheckPointCLOG() done at 2016-02-09 18:04:09.258 UTC
2016-02-09 18:04:09.258 UTC [56ba2944.363000e:43] LOG:  CheckPointCommitTs() done at 2016-02-09 18:04:09.258 UTC
2016-02-09 18:04:09.258 UTC [56ba2944.363000e:44] LOG:  CheckPointSUBTRANS() done at 2016-02-09 18:04:09.258 UTC
2016-02-09 18:04:09.258 UTC [56ba2944.363000e:45] LOG:  CheckPointMultiXact() done at 2016-02-09 18:04:09.258 UTC
2016-02-09 18:04:09.258 UTC [56ba2944.363000e:46] LOG:  CheckPointPredicate() done at 2016-02-09 18:04:09.258 UTC
2016-02-09 18:04:09.258 UTC [56ba2944.363000e:47] LOG:  CheckPointRelationMap() done at 2016-02-09 18:04:09.258 UTC
2016-02-09 18:04:09.258 UTC [56ba2944.363000e:48] LOG:  CheckPointReplicationSlots() done at 2016-02-09 18:04:09.258 UTC
2016-02-09 18:04:09.259 UTC [56ba2944.363000e:49] LOG:  CheckPointSnapBuild() done at 2016-02-09 18:04:09.259 UTC
2016-02-09 18:04:09.259 UTC [56ba2944.363000e:50] LOG:  CheckPointLogicalRewriteHeap() done at 2016-02-09 18:04:09.259 UTC
2016-02-09 18:04:09.259 UTC [56ba2944.363000e:51] LOG:  BufferSync(5) beginning to write 632 buffers at 2016-02-09 18:04:09.259 UTC
2016-02-09 18:04:09.966 UTC [56ba2944.363000e:52] LOG:  BufferSync(5) done, wrote 632/632 buffers at 2016-02-09 18:04:09.966 UTC
2016-02-09 18:04:09.966 UTC [56ba2944.363000e:53] LOG:  CheckPointBuffers() done at 2016-02-09 18:04:09.966 UTC
2016-02-09 18:04:09.966 UTC [56ba2944.363000e:54] LOG:  CheckPointReplicationOrigin() done at 2016-02-09 18:04:09.966 UTC
2016-02-09 18:04:09.966 UTC [56ba2944.363000e:55] LOG:  CheckPointGuts done at 2016-02-09 18:04:09.966 UTC
2016-02-09 18:04:09.967 UTC [56ba2944.363000e:56] LOG:  checkpoint WAL record flushed at 2016-02-09 18:04:09.967 UTC
2016-02-09 18:04:09.967 UTC [56ba2944.363000e:57] LOG:  pg_control updated at 2016-02-09 18:04:09.967 UTC
2016-02-09 18:04:29.375 UTC [56ba2944.363000e:58] LOG:  in mdpostckpt, 3072 unlinks remain to do at 2016-02-09 18:04:29.375 UTC
2016-02-09 18:04:48.207 UTC [56ba2944.363000e:59] LOG:  in mdpostckpt, 2048 unlinks remain to do at 2016-02-09 18:04:48.207 UTC
2016-02-09 18:05:07.381 UTC [56ba2944.363000e:60] LOG:  in mdpostckpt, 1024 unlinks remain to do at 2016-02-09 18:05:07.381 UTC
2016-02-09 18:05:25.764 UTC [56ba2944.363000e:61] LOG:  smgrpostckpt() done at 2016-02-09 18:05:25.764 UTC
2016-02-09 18:05:25.764 UTC [56ba2944.363000e:62] LOG:  RemoveOldXlogFiles() done at 2016-02-09 18:05:25.764 UTC
2016-02-09 18:05:25.764 UTC [56ba2944.363000e:63] LOG:  TruncateSUBTRANS() done at 2016-02-09 18:05:25.764 UTC
2016-02-09 18:05:25.764 UTC [56ba2944.363000e:64] LOG:  shutdown checkpoint complete at 2016-02-09 18:05:25.764 UTC
2016-02-09 18:05:25.764 UTC [56ba2944.363000e:65] LOG:  ShutdownCLOG() complete at 2016-02-09 18:05:25.764 UTC
2016-02-09 18:05:25.764 UTC [56ba2944.363000e:66] LOG:  ShutdownCommitTs() complete at 2016-02-09 18:05:25.764 UTC
2016-02-09 18:05:25.764 UTC [56ba2944.363000e:67] LOG:  ShutdownSUBTRANS() complete at 2016-02-09 18:05:25.764 UTC
2016-02-09 18:05:25.764 UTC [56ba2944.363000e:68] LOG:  database system is shut down at 2016-02-09 18:05:25.764 UTC
2016-02-09 18:05:26.165 UTC [56ba2944.ee01bc:4] LOG:  lock files all released at 2016-02-09 18:05:26.165 UTC

This says that the bulk of the shutdown time went into mdpostckpt's unlink
processing: roughly 76 of the 77 seconds between the shutdown request and
the release of the lock files.

[HACKERS] ALTER EXTENSION DROP FUNCTION not working ?

2016-02-09 Thread Sandro Santilli
I'm seeing an issue with ALTER EXTENSION DROP FUNCTION
not fully unregistering the dependency of the function
on the extension. Commands to reproduce are described here:
https://trac.osgeo.org/postgis/ticket/3450#comment:23

Basically I'm getting:

 ERROR:  cannot drop function pgis_twkb_accum_finalfn(internal) because other 
objects depend on it
 DETAIL:  extension postgis depends on function 
pgis_twkb_accum_finalfn(internal)

Right after successfully running:

 ALTER EXTENSION postgis DROP FUNCTION pgis_twkb_accum_finalfn(internal);

Is it a bug in PostgreSQL? (9.3.6 is running on the test machine.)

--strk;

  ()   Free GIS & Flash consultant/developer
  /\   http://strk.keybit.net/services.html




Re: [HACKERS] checkpointer continuous flushing - V16

2016-02-09 Thread Fabien COELHO


>> I think I would appreciate comments to understand why/how the
>> ringbuffer is used, and more comments in general, so it is fine if you
>> improve this part.
>
> I'd suggest to leave out the ringbuffer/new bgwriter parts.

Ok, so the patch would only include the checkpointer stuff.

I'll look at this part in detail.

--
Fabien.




Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Pavel Stehule
Hi

> Looking at this patch, I have mixed feelings about it. On the one hand
> I really like the look of the output, and I can see that the non-fixed
> nature of the output columns makes this hard to achieve server-side.
>
> But on the other hand, this seems to be going way beyond the normal
> level of result formatting that something like \x does, and I find the
> syntax for sorting particularly ugly. I can understand the need to
> sort the colH values, but it seems to me that the result rows should
> just be returned in the order the server returns them -- i.e., I don't
> think we should allow sorting colV values client-side, overriding a
> server-side ORDER BY clause in the query.
>

This feature has zero relation to the \x option, and any link to this option
is confusing. This is important; otherwise we are back at the start, where I
had a long discussion with Daniel about the name, when I blocked the name
"rotate".


> Client-side sorting makes me uneasy in general, and I think it should
> be restricted to just sorting the columns that appear in the output
> (the colH values). This would also allow the syntax to be simplified:
>
> \crosstabview [colV] [colH] [colG1[,colG2...]] [sortCol [asc|desc]]
>

The sorting on the client side is necessary - minimally in one direction,
because you cannot create a perfect sort order for both dimensions.
The possibility to order in the second dimension is just a convenience,
because you don't need to think two steps ahead when you create the SQL
query.

I have a basic use case that should be supported well, and it is supported
well by the last version of this patch. The evaluation of syntax is
subjective. We can compare Daniel's syntax and your proposal.

The use case: I have a table of invoices with attributes (date, name
and amount). I would like to take a report of amounts across months and
customers. The horizontal dimension is month (name), the vertical
dimension is customer name. I need sorting of months in semantic order
and customers in alphabetical order.

So my query is:

SELECT name, to_char(date, 'mon') AS month, extract(month from date) AS
month_order, sum(amount) AS amount FROM invoices GROUP BY 1,2,3;

and the crosstabview command (per Daniel's proposal) is:

\crosstabview +name  +month:month_order amount

But if I don't need column header in human readable form, I can do

\crosstabview +name +month_order amount
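
For readers skimming the thread, this is roughly the grid the first command
produces (customer names and amounts invented):

 name  | jan | feb | mar
-------+-----+-----+-----
 Alice | 100 |     | 250
 Bob   |  80 | 120 |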

What is the solution for this use case with your proposal?

I agree that this syntax is pretty raw. But it is consistent with other psql
statements and there are no possible conflicts.

What do I mean? Your syntax is ambiguous: \crosstabview [colV] [colH]
[colG1[,colG2...]] [sortCol [asc|desc]] - when I want to enter a sort order
column, I have to enter one or more colG1,... or I have to explicitly enter
the asc or desc keyword.

Regards

Pavel








>
> Overall, I like the feature, but I'm not convinced that it's ready in
> its current form.
>
> For the future (not in this first version of the patch), since the
> transformation is more than just a \x-type formatting of the query
> results, a nice-to-have feature would be a way to save the results
> somewhere -- say by making it play nicely with \g or \copy somehow,
> but I admit that I don't know exactly how that would work.
>
> Regards,
> Dean
>


Re: [HACKERS] checkpointer continuous flushing - V16

2016-02-09 Thread Andres Freund
On February 9, 2016 10:46:34 AM GMT+01:00, Fabien COELHO  
wrote:
>
>>> I think I would appreciate comments to understand why/how the
>>> ringbuffer is used, and more comments in general, so it is fine if you
>>> improve this part.
>>
>> I'd suggest to leave out the ringbuffer/new bgwriter parts.
>
>Ok, so the patch would only include the checkpointer stuff.
>
>I'll look at this part in detail.

Yes, that's the more pressing part. I've seen pretty good results with the new 
bgwriter, but it's not really worthwhile until sorting and flushing is in...

Andres 

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.




Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Dean Rasheed
On 9 February 2016 at 05:24, Pavel Stehule  wrote:
> I don't have the feeling that Daniel and I did anything privately. All work
> was public (I checked my mailbox) - though unhappily it is spread across
> several mailing list threads (not sure how that is possible, because the
> subjects look the same). The discussion about the design was public, I am
> sure. It was a relatively long process, with good progress (from my
> perspective), because Daniel accepted and fixed all my objections. The
> proposed syntax is fully consistent with other psql commands - it is hard
> to create something new there, because the psql parser is pretty limited.
> Although I think the syntax is good, clean and useful, I am open to
> discussing it. Please try the last design, the last patch - I spent a lot
> of hours (and I am sure Daniel spent much more) thinking about how this
> can be designed better.
>

Looking at this patch, I have mixed feelings about it. On the one hand
I really like the look of the output, and I can see that the non-fixed
nature of the output columns makes this hard to achieve server-side.

But on the other hand, this seems to be going way beyond the normal
level of result formatting that something like \x does, and I find the
syntax for sorting particularly ugly. I can understand the need to
sort the colH values, but it seems to me that the result rows should
just be returned in the order the server returns them -- i.e., I don't
think we should allow sorting colV values client-side, overriding a
server-side ORDER BY clause in the query.

Client-side sorting makes me uneasy in general, and I think it should
be restricted to just sorting the columns that appear in the output
(the colH values). This would also allow the syntax to be simplified:

\crosstabview [colV] [colH] [colG1[,colG2...]] [sortCol [asc|desc]]

Overall, I like the feature, but I'm not convinced that it's ready in
its current form.

For the future (not in this first version of the patch), since the
transformation is more than just a \x-type formatting of the query
results, a nice-to-have feature would be a way to save the results
somewhere -- say by making it play nicely with \g or \copy somehow,
but I admit that I don't know exactly how that would work.

Regards,
Dean




Re: [HACKERS] ALTER EXTENSION DROP FUNCTION not working ?

2016-02-09 Thread Sandro Santilli
On Tue, Feb 09, 2016 at 10:33:42AM +0100, Sandro Santilli wrote:

> https://trac.osgeo.org/postgis/ticket/3450#comment:23
> 
> Basically I'm getting:
> 
>  ERROR:  cannot drop function pgis_twkb_accum_finalfn(internal) because other 
> objects depend on it
>  DETAIL:  extension postgis depends on function 
> pgis_twkb_accum_finalfn(internal)

Figured:  the "pgis_twkb_accum_finalfn(internal)" function is not
a _direct_ dependency of extension "postgis", but is needed for
an aggregate which is still registered.

So this is more an annoyance than a bug: the error message is unclear
about which directly dependent object actually prevents the drop.
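
A quick way to see that indirect chain is to ask pg_depend what hangs off
the function; a sketch:

 SELECT pg_describe_object(classid, objid, objsubid) AS dependent,
        deptype
   FROM pg_depend
  WHERE refclassid = 'pg_proc'::regclass
    AND refobjid = 'pgis_twkb_accum_finalfn(internal)'::regprocedure;

This should show the aggregate's normal ('n') dependency on the function;
the extension-membership ('e') row is the one ALTER EXTENSION DROP removed.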

--strk;




Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Dean Rasheed
On 9 February 2016 at 10:09, Pavel Stehule  wrote:
> The sorting on the client side is necessary - minimally in one direction,
> because you cannot create a perfect sort order for both dimensions.
> The possibility to order in the second dimension is just a convenience,
> because you don't need to think two steps ahead when you create the SQL
> query.
>
> I have a basic use case that should be supported well, and it is supported
> well by the last version of this patch. The evaluation of syntax is
> subjective. We can compare Daniel's syntax and your proposal.
>
> The use case: I have a table of invoices with attributes (date, name
> and amount). I would like to take a report of amounts across months and
> customers. The horizontal dimension is month (name), the vertical
> dimension is customer name. I need sorting of months in semantic order
> and customers in alphabetical order.
>
> So my query is:
>
> SELECT name, to_char(date, 'mon') AS month, extract(month from date) AS
> month_order, sum(amount) AS amount FROM invoices GROUP BY 1,2,3;
>
> and the crosstabview command (per Daniel's proposal) is:
>
> \crosstabview +name  +month:month_order amount
>
> But if I don't need column header in human readable form, I can do
>
> \crosstabview +name +month_order amount
>
> What is the solution for this use case with your proposal?
>

So it would just be

SELECT name,
   to_char(date, 'mon') AS month,
   sum(amount) AS amount,
   extract(month from date) AS month_order
 FROM invoices
 GROUP BY 1,2,3
 ORDER BY name
\crosstabview name month amount month_order

Note that I might also want to pass additional sort options, such as
"ORDER BY name NULLS LAST", which the existing syntax doesn't allow.
In the new syntax, such sort options could be trivially supported in
both the server- and client-side sorts:

SELECT name, to_char(date, 'mon') AS month,
   extract(month from date) AS month_order, sum(amount) AS amount
  FROM invoices
 GROUP BY 1,2,3
 ORDER BY name NULLS LAST
\crosstabview name month amount month_order asc nulls last

This is probably not an issue in this example, but it might well be in
other cases. The +/-scol syntax is always going to be limited in what
it can support.


> I agree that this syntax is pretty raw. But it is consistent with other psql
> statements and there are no possible conflicts.
>
> What do I mean? Your syntax is ambiguous: \crosstabview [colV] [colH]
> [colG1[,colG2...]] [sortCol [asc|desc]] - when I want to enter a sort order
> column, I have to enter one or more colG1,... or I have to explicitly enter
> the asc or desc keyword.
>

That is resolved by the comma that precedes colG2, etc. isn't it?

Regards,
Dean




Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Pavel Stehule
> >
> > SELECT name, to_char(date, 'mon') AS month, extract(month from date) AS
> > month_order, sum(amount) AS amount FROM invoices GROUP BY 1,2,3;
> >
> > and the crosstabview command (per Daniel's proposal) is:
> >
> > \crosstabview +name  +month:month_order amount
> >
> > But if I don't need column header in human readable form, I can do
> >
> > \crosstabview +name +month_order amount
> >
> > What is the solution for this use case with your proposal?
> >
>
> So it would just be
>
> SELECT name,
>to_char(date, 'mon') AS month,
>sum(amount) AS amount,
>extract(month from date) AS month_order
>  FROM invoices
>  GROUP BY 1,2,3
>  ORDER BY name
> \crosstabview name month amount month_order
>

Warning: :) now I am being subjective. Daniel's syntax "\crosstabview +name
+month:month_order amount" looks more readable to me, because related
things stay close to each other.


>
> Note that I might also want to pass additional sort options, such as
> "ORDER BY name NULLS LAST", which the existing syntax doesn't allow.
> In the new syntax, such sort options could be trivially supported in
> both the server- and client-side sorts:


> SELECT name, to_char(date, 'mon') AS month,
>extract(month from date) AS month_order, sum(amount) AS amount
>   FROM invoices
>  GROUP BY 1,2,3
>  ORDER BY name NULLS LAST
> \crosstabview name month amount month_order asc nulls last
>

I understand. If I compare these two syntaxes and try to be objective,
then I see:

yours:
  + respects SQL clause ordering, allows pretty complex ORDER BY clauses
  - can fail on unexpected syntax errors
  +/- more verbose
  - allows only one client-side sort
  - less expressive

Daniel's:
  + cannot fail on a syntax error
  + more compact (not necessary to specify ORDER BY clauses)
  + allows specifying a sort in both dimensions
  + more expressive (+colH is more expressive than colV colH col colH)
  - doesn't allow complex order clauses in both dimensions


>
> This is probably not an issue in this example, but it might well be in
> other cases. The +/-scol syntax is always going to be limited in what
> it can support.
>

the +/- syntax can be extended with additional attributes - it is only
syntax (but then there is a risk of possible syntax errors)


>
>
> > I agree that this syntax is pretty raw. But it is consistent with other
> > psql statements and there are no possible conflicts.
> >
> > What do I mean? Your syntax is ambiguous: \crosstabview [colV] [colH]
> > [colG1[,colG2...]] [sortCol [asc|desc]] - when I want to enter a sort
> > order column, I have to enter one or more colG1,... or I have to
> > explicitly enter the asc or desc keyword.
> >
>
> That is resolved by the comma that precedes colG2, etc. isn't it?
>

but colG1 is optional. What happens if you omit every colGx?

Regards

Pavel


>
> Regards,
> Dean
>


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-09 Thread Jeevan Chalke
Hi,

I have reviewed the patch and it looks good to me.
make/make install/make check is fine (when done without -Wall -Werror).

Here are few comments:
1.
With -Wall -Werror, I see couple of warnings:

postgres_fdw.c: In function ‘estimate_path_cost_size’:
postgres_fdw.c:2248:13: error: ‘run_cost’ may be used uninitialized in this
function [-Werror=uninitialized]
postgres_fdw.c: In function ‘conversion_error_callback’:
postgres_fdw.c:3832:6: error: ‘attname’ may be used uninitialized in this
function [-Werror=uninitialized]
cc1: all warnings being treated as errors
make: *** [postgres_fdw.o] Error 1

2. Typo:
scna_clauses => scan_clauses

3. Does this new addition require documentation?


I did not see any issues with my testing. Code changes are good too.
Patch has very good test-cases testing everything required. Nice work.

Thanks.

On Mon, Feb 8, 2016 at 7:11 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

>
>
> On Mon, Feb 8, 2016 at 4:15 PM, Etsuro Fujita  > wrote:
>
>> On 2016/02/05 17:50, Ashutosh Bapat wrote:
>>
>> Btw, IIUC, I think the patch fails to adjust the targetlist of the
>>> top plan created that way, to output the fdw_scan_tlist, as
>>> discussed in [1] (ie, I think the attached patch is needed, which is
>>> created on top of your patch pg_fdw_join_v8.patch).
>>>
>>
>> fdw_scan_tlist represents the output fetched from the foreign server and
>>> is not necessarily the output of ForeignScan. ForeignScan node's output
>>> is represented by tlist argument to.
>>>
>>> 1119 return make_foreignscan(tlist,
>>> 1120 local_exprs,
>>> 1121 scan_relid,
>>> 1122 params_list,
>>> 1123 fdw_private,
>>> 1124 fdw_scan_tlist,
>>> 1125 remote_exprs,
>>> 1126 outer_plan);
>>>
>>> This tlist is built using build_path_tlist() for all join plans. IIUC,
>>> all of them output the same targetlist. We don't need to make sure that
>>> targetlist match as long as we are using the targetlist passed in by
>>> create_scan_plan(). Do you have a counter example?
>>>
>>
>> Maybe my explanation was not correct, but I'm saying that the targetlist
>> of the above outer_plan should be set to the fdw_scan_tlist, to avoid
>> misbehavior.  Here is such an example (add() in the example is a user
>> defined function that simply adds two arguments, defined by: create
>> function add(integer, integer) returns integer as '/path/to/func', 'add'
>> language c strict):
>>
>> postgres=# create foreign table foo (a int) server myserver options
>> (table_name 'foo');
>> postgres=# create foreign table bar (a int) server myserver options
>> (table_name 'bar');
>> postgres=# create table tab (a int, b int);
>> postgres=# insert into foo select a from generate_series(1, 1000) a;
>> postgres=# insert into bar select a from generate_series(1, 1000) a;
>> postgres=# insert into tab values (1, 1);
>> postgres=# analyze foo;
>> postgres=# analyze bar;
>> postgres=# analyze tab;
>>
>> [Terminal 1]
>> postgres=# begin;
>> BEGIN
>> postgres=# update tab set b = b + 1 where a = 1;
>> UPDATE 1
>>
>> [Terminal 2]
>> postgres=# explain verbose select tab.* from tab, foo, bar where foo.a =
>> bar.a and add(foo.a, bar.a) > 0 limit 1 for update;
>>
>> QUERY PLAN
>>
>>
>> 
>> -
>>  Limit  (cost=100.00..107.70 rows=1 width=70)
>>Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
>>->  LockRows  (cost=100.00..2663.48 rows=333 width=70)
>>  Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
>>  ->  Nested Loop  (cost=100.00..2660.15 rows=333 width=70)
>>Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
>>->  Foreign Scan  (cost=100.00..2654.97 rows=333 width=56)
>>  Output: foo.*, bar.*
>>  Filter: (add(foo.a, bar.a) > 0)
>>  Relations: (public.foo) INNER JOIN (public.bar)
>>  Remote SQL: SELECT ROW(r2.a), ROW(r3.a), r2.a, r3.a
>> FROM (public.foo r2 INNER JOIN public.bar r3 ON (TRUE)) WHERE ((r2.a =
>> r3.a)) F
>> OR UPDATE OF r2 FOR UPDATE OF r3
>>  ->  Hash Join  (cost=247.50..301.25 rows=333
>> width=56)
>>Output: foo.*, bar.*
>>Hash Cond: (foo.a = bar.a)
>>Join Filter: (add(foo.a, bar.a) > 0)
>>->  Foreign Scan on public.foo
>> (cost=100.00..135.00 rows=1000 width=32)
>>  Output: foo.*, foo.a
>>  Remote SQL: SELECT a FROM public.foo FOR
>> UPDATE
>>->  Hash  

Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Alvaro Herrera wrote:

> So please can we have that wiki page so that the syntax can be hammered
> out a bit more.

Sure, I'm on it.

> I'm closing this as returned-with-feedback for now.

Well, the feedback it got over the months was incorporated into
the patch in the form of significant improvements, and by the end of
this CF the patch had really been polished, and no other feedback was
coming.

I'll resubmit.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite




Re: [HACKERS] proposal: schema PL session variables

2016-02-09 Thread Marko Tiikkaja

On 08/02/16 14:16, Pavel Stehule wrote:
> 2016-02-08 13:53 GMT+01:00 Marko Tiikkaja :
>
>> Yeah, and that's exactly what I don't want, because that means that CREATE
>> SCHEMA VARIABLE suddenly breaks existing code.
>
> theoretically yes, but this conflict can be 100% detected - so no quiet bug
> is possible, and plpgsql_check can find this issue well. If you don't use
> schema variables, then your code will be correct. You have to explicitly
> create the variable, and if there will be any problem, then the problem
> will be only in PL functions in one schema. And you can identify it by
> static analysis.

I'm sorry, but I think you've got your priorities completely backwards.
You're saying that it's OK to add a footgun because blown-off pieces
of feet can be found by using a third party static analyzer barely
anyone uses.  And at best, that footgun is only a very minor convenience
(though I'd argue that omitting it actually hurts readability).

That makes absolutely no sense to me at all.


.m




Re: [HACKERS] proposal: schema PL session variables

2016-02-09 Thread Pavel Stehule
2016-02-09 15:32 GMT+01:00 Marko Tiikkaja :

> On 08/02/16 14:16, Pavel Stehule wrote:
>
>> 2016-02-08 13:53 GMT+01:00 Marko Tiikkaja :
>>
>>>
>>> Yeah, and that's exactly what I don't want, because that means that
>>> CREATE
>>> SCHEMA VARIABLE suddenly breaks existing code.
>>>
>>>
>> theoretically yes, but this conflict can be 100% detected - so no quiet
>> bug
>> is possible, and plpgsql_check can find this issue well. If you don't use
>> schema variables, then your code will be correct. You have to explicitly
>> create the variable, and if there will be any problem, then the problem
>> will be only in PL functions in one schema. And you can identify it by
>> static analysis.
>>
>
> I'm sorry, but I think you've got your priorities completely backwards.
> You're saying that it's OK to add a footgun because blown-off pieces of
> feet can be found by using a third party static analyzer barely anyone
> uses.  And at best, that footgun is only a very minor convenience (though
> I'd argue that omitting it actually hurts readability).
>

I am not blocking the integration of plpgsql_check into upstream. I spent
hundreds of hours on it.

Can we look at this problem from a different side? What can I do to make
the proposed schema variables safe to use?

The possible ways:

1. Require a prefix like : or @. I don't prefer it because a) it is hard to
find an agreement - Oracle fans like ":", MSSQL fans like @, others maybe $,
and b) for any currently unsupported syntax I have to fix the SQL lexer and
parser.

2. Require a qualified name everywhere - it can work, but I don't prefer it,
because it can sometimes be unpleasant to write long qualified identifiers.
There are no schema aliases in PL/pgSQL. This could be solved by variable
aliases, but that requires an alias.

3. A plpgsql GUC where schema variables are: a) disabled, b) enabled, c)
allowed with qualified names only - it is similar to the #variable_conflict
option.

I prefer @3 with "c" as the default, but I can live with @2, and dislike @1
for the reasons mentioned.
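
To make the three variants concrete, hypothetical illustrations (none of
this syntax exists today, and the GUC name is invented):

    SELECT @total_sales;            -- variant 1: mandatory prefix
    SELECT billing.total_sales;     -- variant 2: qualified name required
    SET plpgsql.schema_variables = 'qualified_only';
                                    -- variant 3: a GUC, similar in spirit
                                    -- to #variable_conflict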

Can you be satisfied by any of the mentioned variants?

Regards

Pavel


>
> That makes absolutely no sense to me at all.
>
>
> .m
>


Re: [HACKERS] ALTER EXTENSION DROP FUNCTION not working ?

2016-02-09 Thread Tom Lane
Sandro Santilli  writes:
> On Tue, Feb 09, 2016 at 10:33:42AM +0100, Sandro Santilli wrote:
>> https://trac.osgeo.org/postgis/ticket/3450#comment:23
>> 
>> Basically I'm getting:
>> ERROR:  cannot drop function pgis_twkb_accum_finalfn(internal) because other 
>> objects depend on it
>> DETAIL:  extension postgis depends on function 
>> pgis_twkb_accum_finalfn(internal)

> Figured:  the "pgis_twkb_accum_finalfn(internal)" function is not
> a _direct_ dependency of extension "postgis", but is needed for
> an aggregate which is still registered.

> So this is more an annoyance than a bug, being the non-clear error
> message about what's the direct dependent object that prevents
> the drop.

I believe this is an intentional design decision: if you try to drop
something that an extension member object depends on, we complain about
the extension, not about the member object.  This is on the grounds
that a user of an extension likely doesn't know or care exactly what
all the extension members are, and would rather think of the extension
as a unit.  Consider for example that if you'd said CASCADE, the command
would have led to dropping the whole extension; would you want each
member object of the extension to be called out explicitly as a drop
cascade target?

I agree that it's a bit unintuitive when you're the developer of the
extension, but I doubt we're going to revisit this choice.

regards, tom lane




Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Alvaro Herrera wrote:

> Also, what about the business of putting "x" if there's no third column?
> Three months from now some Czech psql hacker will say "we should use
> Unicode chars for this" and we will be forever stuck with \pset
> unicode_crosstab_marker to change the character to a ☑ BALLOT BOX WITH
> CZECH.  Maybe we should think that a bit harder -- for example, what
> about just rejecting the case with no third column and forcing the user
> to add a third column with the character they choose?  That way you
> avoid that mess.

Yes, that implicit "X" with two-column result sets is not essential; it may
be removed without real damage.

About the possible suggestion to have a \pset unicode_crosstab_marker,
my opinion would be that it's not important enough to justify a new
\pset setting.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite




Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Alvaro Herrera wrote:

> While I understand that you may think that "silence is consent",
> what I am afraid of is that some committer will look at this two
> months from now and say "I hate this Hcol+ stuff, -1 from me" and
> send the patch back for syntax rework.  IMO it's better to have more
> people chime in here so that the patch that we discuss during the
> next commitfest is really the best one we can think of.

Yes, but on the other hand we can't force people to participate.
If a patch is moving forward and being discussed here between
one author and one reviewer, and nothing particularly wrong
pops out in what is discussed, the reality is that other people will
not intervene.

Besides, as is mentioned here frequently, all patches, even
much more important ones, are short on reviews, reviewers
and testing; still, new stuff must keep getting into the source tree
for things to progress.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite




[HACKERS] Re: More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)

2016-02-09 Thread Noah Misch
On Wed, Dec 30, 2015 at 08:16:28PM +1300, David Rowley wrote:
> On [1] I suggested an idea to make improvements to the planner around the
> Equivalence Class code. Later in [2] Tom raised concerns with this adding
> too many planning cycles for a perhaps not common enough situation.  I
> don't want to discuss that particular patch here, I want to discuss more
> generally about the dilemma about adding more smarts to the planner to
> allow it to generate a more optimal plan in order to save on execution time.

It's an important topic.

> A number of ideas were suggested on the other thread about how we might go
> about solving this problem. In [3] Simon talked about perhaps enabling
> extra optimisations when the planner sees that the plan will cost more than
> some given threshold. That's perhaps an option, but may not work well for
> optimisations which must take place very early in planning, for example [4].

Yeah.  A slew of steps precede us assembling a notion of total cost.  I bet
most controversial proposed steps will happen in that early period.  We'd need
a rough cost earlier than we get it today, and I don't know what achieving
that would look like.  Simon, did you have specifics in view?

> Another idea which came up was from Evgeniy [5], which was more of a
> request not to do it this way, but never-the-less, the idea was basically
> to add lots of GUCs to enable/disable each extra planner feature.

This and subsequent ideas each envision some kind of optimization step
blacklist.  Suppose it's a bitmap with one bit per optional optimizer step.
Each idea chooses blacklist membership differently, but the planner acts on
the blacklist about the same way.  I paraphrase the ideas in those terms
below, and I offer a couple more.  For this idea, the blacklist is simple:

1. User defines the blacklist fully.

It's essentially handing the hard part back to the user.  While I sympathize
with allergic reactions to this, I like it far better than rejecting
optimizations based on thinking the average case wants them disabled.
work_mem likewise isn't the ideal knob for any user, but it has a simple
implementation and beats "1MB per hash table is okay for everyone."
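
In bitmap terms, the machinery behind all of these ideas could be as simple
as the following sketch (the step names are invented for illustration):

    /* one bit per optional planner step */
    #define OPT_STEP_EC_FILTER       (1 << 0)
    #define OPT_STEP_EXTRA_PUSHDOWN  (1 << 1)

    typedef uint32 OptBlacklist;

    /* idea 1: the user supplies the blacklist verbatim, e.g. via a GUC */
    static inline bool
    opt_step_enabled(OptBlacklist bl, uint32 step)
    {
        return (bl & step) == 0;
    }

The numbered ideas here differ only in who fills in the bits, and when.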

> Another option which I've thought about previously was a planner_strength
> GUC, at which various additional optimisations are enabled at various
> predefined strength levels, so that databases which tend to spend a great
> deal more execution time compared to planning time can turn this up a bit
> to see if that helps change that ratio a bit.  This idea is far from

2. User selects from predefined blacklists like "maximum", "default", etc.

> perfect though, as who's to say that planner feature X should "kick in"
> before planner feature Y? I've also often thought that it might be nice to

Yep.  It will be essentially impossible to build an argument to move a planner
feature from one strength to another.  If the feature's committer has
personally experienced the problem in the field, the optimization will end up
active at default planner strength.

> have it so the planner does not modify the Parse object, so that the
> planner has the option to throw away what it's done so far and start
> planning all over again with the "planner_strength" knob turned up to the
> maximum, if the cost happened to indicate that the query was going to take
> a long time to execute.

3. System first plans with a specific predefined blacklist that omits
   speculative (low probability, high payoff) steps.  If that yields a
   high-cost plan, it repeats planning with an empty blacklist.

I agree that the planner's modification of the Query tree is a substantial
roadblock.  The design needs to work for one-shot plans, and we're unlikely to
recoup the cost of copying every one-shot Query tree.

> So here I'd very much like to kick off discussion on an acceptable way to
> solve this problem, in a realistic way which we're all happy with.

It's bad to add 10us per plan times 3000/s/client, but the same waste once per
minute per client is no cause for concern.  I advise accepting speculative
planner optimizations, enabled by default when similar in cost to comparable
existing steps.  At the same time, I encourage developing automation to cap
the waste when a client elicits millions of planner runs that don't benefit
from a certain optimization.  Specific lines of inquiry:

4. Derive a conservative blacklist for each rewritten Query when first
   planning it, and use that blacklist for subsequent plans.

Some prepared statements use a custom plan every time.  (Constraint exclusion
is one driver of such cases.)  Many facets of a Query's planning problem don't
depend on parameter values, so a particular optimization will apply to all the
custom plans or to none of them.  Let the first plan build a blacklist of
provably-irrelevant optimizations, which the plan cache stores and furnishes
to later runs.  The first plan after an invalidation recomputes the blacklist.

5. Add a 

Re: [HACKERS] Existence check for suitable index in advance when concurrently refreshing.

2016-02-09 Thread Michael Paquier
On Tue, Feb 9, 2016 at 4:27 PM, Fujii Masao  wrote:
> Thanks for updating the patch!
> Attached is the updated version of the patch.
> I removed unnecessary assertion check and change of source code
> that you added, and improved the source comment.
> Barring objection, I'll commit this patch.

So, this code basically duplicates what is already in
refresh_by_match_merge to check if there is a UNIQUE index defined. If
we are sure that an error is detected earlier in the code as done in
this patch, wouldn't it be better to replace the error message in
refresh_by_match_merge() by an assertion? Just wondering, I would
think that once this patch is applied the existing error message of
refresh_by_match_merge() is just dead code.
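
In other words (a sketch only; the flag name is hypothetical), the check in
refresh_by_match_merge() would shrink from an ereport(ERROR) to something
like:

    /* callers are expected to have verified this before we get here */
    Assert(foundUniqueIndex);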
-- 
Michael




Re: [HACKERS] [PATCH] Refactoring of LWLock tranches

2016-02-09 Thread Amit Kapila
On Fri, Feb 5, 2016 at 3:17 AM, Robert Haas  wrote:
>
>
> I think we ought to move the buffer mapping, lock manager, and
> predicate lock manager locks into their own tranches also, perhaps
> using this new named-tranche facility.
>

Makes sense, and the attached patch implements it using the new named
tranches facility.  One thing to note is that to make it work on
EXEC_BACKEND builds, I have passed the individual LWLock
array via the save-restore backendparams mechanism.  Instead,
we could have initialised the LWLock arrays in each backend at
start-up, and I have analysed that approach as well, but we would need
to use NamedLWLockTrancheArray instead of
NamedLWLockTrancheRequestArray in GetNamedLWLockTranche(),
and not only that, we would also need to store the number-of-locks
information in NamedLWLockTrancheArray as well.  So it seems
better to use the save-restore backendparams mechanism for
passing the LWLock arrays.


>  In addition, I think we should
> get rid of NumLWLocks().  It's no longer calculating anything; it just
> returns NUM_FIXED_LWLOCKS, and with the changes described in the first
> sentence of this paragraph, it will simply return
> NUM_INDIVIDUAL_LWLOCKS.  We don't need a function just to do that; the
> assignment it does can be moved to the relevant caller.
>

Agreed, and we don't even need an array of LWLockCounters; just one
counter is sufficient for tranches.  Apart from this, the function
LWLockAssign() seems redundant to me after the new implementation
of named tranches, so I have removed it as well.
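
For reference, consuming the named-tranche facility from an extension looks
roughly like this (extension name and lock count invented; a sketch of the
API as committed, not authoritative):

    /* in _PG_init(), while processing shared_preload_libraries */
    RequestNamedLWLockTranche("my_ext", 2);

    /* later, in any backend */
    LWLockPadded *locks = GetNamedLWLockTranche("my_ext");

    LWLockAcquire(&locks[0].lock, LW_EXCLUSIVE);
    /* ... touch the shared state this lock protects ... */
    LWLockRelease(&locks[0].lock);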



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


fixed_locks_tranche_v1.patch
Description: Binary data



Re: [HACKERS] Support for N synchronous standby servers - take 2

2016-02-09 Thread Michael Paquier
On Wed, Feb 3, 2016 at 7:33 AM, Robert Haas wrote:
> Also, to be frank, I think we ought to be putting more effort into
> another patch in this same area, specifically Thomas Munro's causal
> reads patch.  I think a lot of people today are trying to use
> synchronous replication to build load-balancing clusters and avoid the
> problem where you write some data and then read back stale data from a
> standby server.  Of course, our current synchronous replication
> facilities make no such guarantees - his patch does, and I think
> that's pretty important.  I'm not saying that we shouldn't do this
> too, of course.

Yeah, sure. Each one of those patches is trying to solve a different
problem where Postgres is deficient, here we'd like to be sure a
commit WAL record is correctly flushed on multiple standbys, while the
patch of Thomas is trying to ensure that there is no need to scan for
the replay position of a standby using some GUC parameters and a
validation/sanity layer in syncrep.c to do that. Surely the patch of
this thread has got more attention than Thomas', and both of them have
merits and try to address real problems. FWIW, the patch of Thomas is
a topic that I find rather interesting, and I am planning to look at
it as well, perhaps for next CF or even before that. We'll see how
other things move on.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-02-09 Thread Thom Brown
On 7 January 2016 at 05:24, Amit Kapila  wrote:
> On Fri, Dec 25, 2015 at 6:36 AM, Robert Haas  wrote:
>>
>> On Wed, Dec 23, 2015 at 1:16 AM, Amit Kapila 
>> wrote:
>> >> >
>> >> > Here procArrayGroupXid sounds like Xid at group level, how about
>> >> > procArrayGroupMemberXid?
>> >> > Find the patch with renamed variables for PGProc
>> >> > (rename_pgproc_variables_v1.patch) attached with mail.
>> >>
>> >> I sort of hate to make these member names any longer, but I wonder if
>> >> we should make it procArrayGroupClearXid etc.
>> >
>> >> > If we go by this suggestion, then the name will look like:
>> >> > PGProc
>> >> > {
>> >> > ..
>> >> >     bool             procArrayGroupClearXid;
>> >> >     pg_atomic_uint32 procArrayGroupNextClearXid;
>> >> >     TransactionId    procArrayGroupLatestXid;
>> >> > ..
>> >> > }
>> >> >
>> >> > PROC_HDR
>> >> > {
>> >> > ..
>> >> >     pg_atomic_uint32 procArrayGroupFirstClearXid;
>> >> > ..
>> >> > }
>> >
>> > I think whatever I sent in last patch were better.  It seems to me it is
>> > better to add some comments before variable names, so that anybody
>> > referring them can understand better and I have added comments in
>> > attached patch rename_pgproc_variables_v2.patch to explain the same.
>>
>> Well, I don't know.  Anybody else have an opinion?
>>
>
> It seems that either people don't have any opinion on this matter or they
> are okay with either of the naming conventions being discussed.  I think
> specifying Member after procArrayGroup can help distinguishing which
> variables are specific to the whole group and which are specific to a
> particular member.  I think that will be helpful for other places as well
> if we use this technique to improve performance.  Let me know what
> you think about the same.
>
> I have verified that previous patches can be applied cleanly and passes
> make check-world.  To avoid confusion, I am attaching the latest
> patches with this mail.

Patches still apply 1 month later.

I don't really have an opinion on the variable naming.  I guess they
only need making longer if there's going to be some confusion about
what they're for, but I'm guessing it's not a blocker here.

Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby

2016-02-09 Thread Michael Paquier
On Tue, Feb 9, 2016 at 2:24 PM, Amit Kapila  wrote:
> Do you see any benefit in allowing checkpoints for such cases considering
> bgwriter will anyway take care of logging standby snapshot for such
> cases?

Well, the idea is to improve the system's responsiveness. Imagine that
the call to GetProgressRecPtr() is done within the exclusive lock
portion, and that while scanning the WAL insert lock entries another
backend is waiting to take a lock on them, trying to insert the first
record that updates the progress LSN since the last checkpoint. In
this case the checkpoint would be skipped. However, if such a record
manages to get in while we are scanning the progress values in the WAL
insert locks, a checkpoint would be generated. In any case, I think
that we should try to keep exclusive lock areas as small as possible
if we have ways to do so.
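
To make that concrete, the skip decision being discussed boils down to
something like the following sketch (GetLastCheckpointRecPtr() and
GetProgressRecPtr() are the functions from the patch; the surrounding
checkpointer code is elided and the details are assumptions):

    /* In CreateCheckPoint(), for a non-forced checkpoint: */
    if (GetProgressRecPtr() <= GetLastCheckpointRecPtr())
    {
        /*
         * No WAL record that "matters" has been inserted since the last
         * checkpoint, so skip this one entirely.  Because the
         * GetProgressRecPtr() call is made outside the exclusive-lock
         * section, a concurrent insertion can still flip this decision.
         */
        return;
    }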

> I don't think there is any reasonable benefit to improve the situation of
> idle-system check for checkpoint other than just including
> standby snapshot WAL record.  OTOH as checkpoint is not so seldom
> operation, so allowing such a change should be okay, but I don't see
> the need for same.

I am not sure I understand your point here...
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby

2016-02-09 Thread Amit Kapila
On Tue, Feb 9, 2016 at 6:08 PM, Michael Paquier 
wrote:
>
> On Tue, Feb 9, 2016 at 2:24 PM, Amit Kapila 
wrote:
> > Do you see any benefit in allowing checkpoints for such cases
considering
> > bgwriter will anyway take care of logging standby snapshot for such
> > cases?
>
> Well, the idea is to improve the system's responsiveness. Imagine that
> the call to GetProgressRecPtr() is done within the exclusive lock
> portion, and that while scanning the WAL insert lock entries another
> backend is waiting to take a lock on them, trying to insert the first
> record that updates the progress LSN since the last checkpoint. In
> this case the checkpoint would be skipped. However, if such a record
> manages to get in while we are scanning the progress values in the WAL
> insert locks, a checkpoint would be generated.

Such a case was not covered before your patch, and I don't see the
need for it, especially at the cost of taking locks.

>  In any case, I think that we should try
> to keep exclusive lock areas as small as possible if we have ways to do
> so.
>

Sure, but not when you are already going to hold the lock for a longer
duration.


-    last_snapshot_lsn != GetXLogInsertRecPtr())
+    GetLastCheckpointRecPtr() < GetProgressRecPtr())

How is the above check in the patch supposed to work?
I think it could happen that, once the bgwriter has logged a snapshot,
neither the checkpoint LSN nor progressAt gets updated any further, and
this check would then allow snapshots to be logged again and again in
such a case.



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Dean Rasheed
On 9 February 2016 at 11:06, Pavel Stehule  wrote:
>   + respect SQL clauses ordering, allows pretty complex ORDER BY clause

That, to me, is the key point. SQL already allows very powerful
sorting, so psql should not just throw away the query's sort order and
replace it with something much more basic and limited. The exact
syntax can be debated, but I don't think psql should be doing row
sorting.

I also don't believe that extending the +/- sort syntax to support
more advanced options will be particularly easy, and the result is
likely to be even less readable. It also requires the user to learn
another syntax, when they will already be familiar with SQL's sort
syntax.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add schema-qualified relnames in constraint error messages.

2016-02-09 Thread Shulgin, Oleksandr
On Mon, Feb 8, 2016 at 5:24 PM, Daniel Verite 
wrote:

> Shulgin, Oleksandr wrote:
>
> > Added to the Open commitfest: https://commitfest.postgresql.org/9/475/
>
> Here's a review. Note that the patch tested and submitted
> is not the initial one in the thread, so it doesn't exactly
> match $subject now.
>

I've edited the CF entry title to read: Add \errverbose to psql (and
support in libpq)

What's tested here is a client-side approach, suggested by Tom
> upthread as an alternative, and implemented by Oleksandr in
> 0001-POC-errverbose-in-psql.patch
>
> The patch has two parts: one part in libpq exposing a new function
> named PQresultBuildErrorMessage, and a second part implementing an
> \errverbose command in psql, essentially displaying the result of the
> function.
> The separation makes sense if we consider that other clients might benefit
> from the function, or that libpq is a better place than psql to maintain
> this in the future, as the list of error fields available in a PGresult
> might grow.
> OTOH maybe adding a new libpq function just for that is overkill, but this
> is subjective.
>
> psql part
> =
> Compiles and works as intended.
> For instance with \set VERBOSITY default, an FK violation produces:
>
>   # insert into table2 values(10);
>   ERROR:  insert or update on table "table2" violates foreign key
> constraint
> "table2_id_tbl1_fkey"
>   DETAIL:  Key (id_tbl1)=(10) is not present in table "table1".
>
> Then \errverbose just displays the verbose form of the error:
>   # \errverbose
> ERROR:  23503: insert or update on table "table2" violates foreign
>   key constraint "table2_id_tbl1_fkey"
> DETAIL:  Key (id_tbl1)=(10) is not present in table "table1".
> SCHEMA NAME:  public
> TABLE NAME:  table2
> CONSTRAINT NAME:  table2_id_tbl1_fkey
> LOCATION:  ri_ReportViolation, ri_triggers.c:3326
>
> - make check passes
> - valgrind test is OK (no obvious leak after using the command).
>
> Missing bits:
> - the command is not mentioned in the help (\? command, see help.c)
> - it should be added to tab completions (see tab-complete.c)
> - it needs to be described in the SGML documentation
>
> libpq part
> ==
> The patch implements and exports a new PQresultBuildErrorMessage()
> function.
>
> AFAICS its purpose is to produce a result similar to what
> PQresultErrorMessage() would have produced, if PQERRORS_VERBOSE
> was the verbosity in effect at execution time.
>
> My comments:
>
> - the name of the function does not really hint at what it does.
> Maybe something like PQresultVerboseErrorMessage() would be more
> explicit?
>

Not exactly, the verbosity setting might or might not be in effect when
this function is called.  Another external part of the state that might
affect the result is the conn->show_context field.

I would expect the new function to have the same interface as
> PQresultErrorMessage(), but it's not the case.
>
> - it takes a PGconn* and PGresult* as input parameters, but
> PQresultErrorMessage takes only a PGresult* as input.
> It's not clear why PGconn* is necessary.
>

This is because PQresultErrorMessage() returns a stored error message:
res->errMsg (or "").

- it returns a pointer to a strdup'ed() buffer, which
> has to be freed separately by the caller. That differs from
> PQresultErrorMessage() which keeps this managed inside the
> PGresult struct.
>

For the same reason: this function actually produces a new error message,
as opposed to returning a stored one.

- if protocol version < 3, an error message is returned. It's not
> clear to the caller that this error is emitted by the function itself
> rather than the query. Besides, are we sure it's necessary?
> Maybe the function could just produce an output with whatever
> error fields it has, even minimally with older protocol versions,
> rather than failing?
>

Hm, probably we could just copy the message from conn->errorMessage in
the case of protocol v2, but then I don't see a point in passing a
PGresult to the function, or in the PQresult naming, for the v2 case.

- if the fixed error message is kept, it should pass through
> libpq_gettext() for translation.
>

Good point.

- a description of the function should be added to the SGML doc.
> There's a sentence in PQsetErrorVerbosity() that says, currently:
>
>   "Changing the verbosity does not affect the messages available from
>already-existing PGresult objects, only subsequently-created ones."
>
> As it's precisely the point of that new function, that bit could
> be altered to refer to it.
>

I didn't touch the documentation specifically, because this is just a
proof-of-concept, but it's good that you noticed this detail.  Most
importantly, I'd like to learn of better options than storing the whole
last_result in psql's pset structure.
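
For reference, the intended call pattern for the new function would look
roughly like this (per the interface described in the review above: a
PGconn* and a PGresult* in, a malloc'd string out that the caller frees;
treat the exact signature as the POC's, not a settled API):

    char *msg = PQresultBuildErrorMessage(conn, res);

    if (msg)
    {
        fputs(msg, stderr);
        free(msg);          /* caller-owned, unlike PQresultErrorMessage() */
    }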

Thanks for the review!

--
Alex


Re: [HACKERS] Tracing down buildfarm "postmaster does not shut down" failures

2016-02-09 Thread Andrew Dunstan



On 02/08/2016 10:55 PM, Tom Lane wrote:

Noah Misch  writes:

On Mon, Feb 08, 2016 at 02:15:48PM -0500, Tom Lane wrote:

We've seen variants
on this theme on half a dozen machines just in the past week --- and it
seems to mostly happen in 9.5 and HEAD, which is fishy.

It has been affecting only the four AIX animals, which do share hardware.
(Back in 2015 and once in 2016-01, it did affect axolotl and shearwater.)

Certainly your AIX critters have shown this a bunch, but here's another
current example:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=axolotl&dt=2016-02-08%2014%3A49%3A23


That's reasonable.  If you would like higher-fidelity data, I can run loops of
"pg_ctl -w start; make installcheck; pg_ctl -t900 -w stop", and I could run
that for HEAD and 9.2 simultaneously.  A day of logs from that should show
clearly if HEAD is systematically worse than 9.2.

That sounds like a fine plan, please do it.


So, I wish to raise the timeout for those animals.  Using an environment
variable was a good idea; it's one less thing for test authors to remember.
Since the variable affects a performance-related fudge factor rather than
change behavior per se, I'm less skittish than usual about unintended
consequences of dynamic scope.  (With said unintended consequences in mind, I
made "pg_ctl register" ignore PGCTLTIMEOUT rather than embed its value into
the service created.)

While this isn't necessarily a bad idea in isolation, the current
buildfarm scripts explicitly specify a -t value to pg_ctl stop, which
I would not expect an environment variable to override.  So we need
to fix the buildfarm script to allow the timeout to be configurable.
I'm not sure if there would be any value-add in having pg_ctl answer
to an environment variable once we've done that.



The failure on axolotl was for the ECPG tests, which don't use the 
buildfarm's startup/stop db code. They don't honour TEMP_CONFIG either, 
which they probably should - then we might get better log traces.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Dean Rasheed wrote:

> I don't think we should allow sorting colV values client-side,
> overriding a server-side ORDER BY clause in the query.

I shared that opinion until (IIRC) the v8 or v9 of the patch.
Most of the evolution of this patch has been to go
from no client-side sorting option at all, to the full range
of possibilities, ascending or descending, and in both
vertical and horizontal directions.

I agree that colV sorting can be achieved through the
query's ORDER BY, which additionally is more efficient
so it should be the primary choice.

The reason to allow [+/-]colV in \crosstabview is because
I think the average user will expect it, by symmetry with colH.
As the display is reorganized to be like a "grid" instead of a "list
with several columns", we shift the focus to the symmetry
between horizontal and vertical headers, rather than on
the pre-crosstab form of the resultset, even if it's the
same data.
It's easier for the user to just stick a + in front of a column
reference than to figure out that the same result could
be achieved by editing the query and changing/adding
an ORDER BY.

Or said otherwise, having the [+/-] colV sorting is a way to
avoid the question:
"we can sort the horizontal header, so why can't we sort the
vertical header just the same?"


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GinPageIs* don't actually return a boolean

2016-02-09 Thread Yury Zhuravlev

On Tuesday, September 29, 2015 at 19:02:59 MSK, Alvaro Herrera wrote:

Andres Freund wrote:


I went through all headers in src/include and checked for macros
containing [^&]&[^&] and checked whether they have this hazard. Found a
fair number.

That patch also changes !! tests into != 0 style.


I don't think you pushed this, did you?



Hello hackers!
I've just run into a problem with these macros. The function
ginStepRight() breaks completely when compiled with MSVC 2013 and MSVC
2015 (since those releases use C99-style bools, but without stdbool.h,
as in C++).
I don't understand why the patch has not been committed yet. Whether we
write !! or != 0 seems unimportant to me; what matters is that a fix
gets in.
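
For reference, the kind of change at stake looks like this (the macro is
quoted from memory of gin_private.h, so treat the exact definition as
approximate):

    /* Before: the macro yields the raw flag bits, not 0 or 1 ... */
    #define GinPageIsLeaf(page) ( GinPageGetOpaque(page)->flags & GIN_LEAF )

    /* ... after: forcing a genuine boolean avoids the narrowing hazard
     * when the result is stored in a C99/C++ bool or compared directly. */
    #define GinPageIsLeaf(page) (( GinPageGetOpaque(page)->flags & GIN_LEAF ) != 0)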


Thanks!

--
Yury Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Dean Rasheed wrote:

> Note that I might also want to pass additional sort options, such as
> "ORDER BY name NULLS LAST", which the existing syntax doesn't allow.
> In the new syntax, such sort options could be trivially supported in
> both the server- and client-side sorts:

Note that NULL values in the column that pivots are discarded
by \crosstabview, because NULL as the name of a column does not
make sense.

The doc (in the patch) says:

"The horizontal header, displayed as the first row, contains the set of
all distinct non-null values found in column colH"


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Tom Lane
"Daniel Verite"  writes:
>   Dean Rasheed wrote:
>> I don't think we should allow sorting colV values client-side,
>> overriding a server-side ORDER BY clause in the query.

> I shared that opinion until (IIRC) the v8 or v9 of the patch.
> Most of the evolution of this patch has been to go
> from no client-side sorting option at all, to the full range
> of possibilities, ascending or descending, and in both
> vertical and horizontal directions.

I haven't been paying attention to this thread ... but it is sure
sounding like this feature has gotten totally out of hand.  Suggest
reconsidering your design goals.

> Or said otherwise, having the [+/-] colV sorting is a way to
> avoid the question:
> "we can sort the horizontal header, so why can't we sort the
> vertical header just the same?"

I would turn that around, and ask why not remove *both* those things.

I do not think we want any client-side sorting in this feature at all,
because the minute you have any such thing, you are going to have an
absolutely never-ending stream of demands for more sorting features:
multi column, numeric vs text, ASC vs DESC, locale-aware, etc etc etc.
I'd rather reject the feature altogether than expect that psql is going
to have to grow all of that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for N synchronous standby servers - take 2

2016-02-09 Thread Masahiko Sawada
On Tue, Feb 9, 2016 at 10:32 PM, Michael Paquier
 wrote:
> On Wed, Feb 3, 2016 at 7:33 AM, Robert Haas wrote:
>> Also, to be frank, I think we ought to be putting more effort into
>> another patch in this same area, specifically Thomas Munro's causal
>> reads patch.  I think a lot of people today are trying to use
>> synchronous replication to build load-balancing clusters and avoid the
>> problem where you write some data and then read back stale data from a
>> standby server.  Of course, our current synchronous replication
>> facilities make no such guarantees - his patch does, and I think
>> that's pretty important.  I'm not saying that we shouldn't do this
>> too, of course.
>
> Yeah, sure. Each one of those patches is trying to solve a different
> problem where Postgres is deficient, here we'd like to be sure a
> commit WAL record is correctly flushed on multiple standbys, while the
> patch of Thomas is trying to ensure that there is no need to scan for
> the replay position of a standby using some GUC parameters and a
> validation/sanity layer in syncrep.c to do that. Surely the patch of
> this thread has got more attention than Thomas', and both of them have
> merits and try to address real problems. FWIW, the patch of Thomas is
> a topic that I find rather interesting, and I am planning to look at
> it as well, perhaps for next CF or even before that. We'll see how
> other things move on.

Attached is a first version of the dedicated-language patch (there is no
documentation patch yet).

This patch supports only a one-level priority method, but the feature is
meant to be extended later with a quorum method and more than one level
of nesting, so the patch is implemented with that extensibility in mind.
I've also implemented the new system view we discussed on this thread,
but that feature is not included in this patch (because it's not
necessary yet)

== Syntax ==
s_s_names accepts two kinds of syntax, as follows:

1. s_s_names = 'node1, node2, node3'
2. s_s_names = '2[node1, node2, node3]'

The #1 syntax exists for backward compatibility and implies that the
master server waits for only one standby.
The #2 syntax is the new syntax using the dedicated language.

In the #2 setting above, the node1 standby has the lowest priority and
the node3 standby has the highest priority.
The master server will make COMMIT wait until at least the two
lowest-priority standbys have sent their ACKs.

== Memory Structure ==
Previously, the master server kept the value of s_s_names as a plain
string and used it when determining standby priorities.
This patch changes that: the master now has a new memory structure
(called SyncGroupNode) so that it can handle multiple (and, in the
future, nested) standby nodes flexibly.
All the information in SyncGroupNode is set while parsing s_s_names.

The memory structure is,

struct SyncGroupNode
{
    /* Common information */
    int             type;
    char           *name;
    SyncGroupNode  *next;       /* next name node in the same group */

    /* For group nodes */
    int             sync_method;    /* priority */
    int             wait_num;
    SyncGroupNode  *member;     /* members of this group */
    bool            (*SyncRepGetSyncedLsnsFn) (SyncGroupNode *group,
                                               XLogRecPtr *write_pos,
                                               XLogRecPtr *flush_pos);
    int             (*SyncRepGetSyncStandbysFn) (SyncGroupNode *group,
                                                 int *list);
};

A SyncGroupNode can be one of two types - a name node or a group node -
and it holds a pointer to the next name/group node in the same group as
well as the list of its group members.
A name node represents a synchronous standby.
A group node represents a group of name nodes; it has the list of its
members, the synchronization method, and the number of nodes to wait
for.
The member list is a one-way linked list, kept in s_s_names definition
order.
e.g. in the case of the #2 setting above, the member list would be,

"main".member -> "node1".next -> "node2".next -> "node3".next -> NULL

The top-level node is always the "main" group node; i.e., in this
version of the patch only one group (the "main" group) is created,
containing some name nodes (no nested group nodes).
A group node has two function pointers:

* SyncRepGetSyncedLsnsFn :
This function decides the group's write/flush LSNs at that moment.
For example, with the priority method, the lowest LSNs among the
standbys that are considered synchronous should be selected.
If there are not enough synchronous standbys to decide the LSNs, this
function returns false.

* SyncRepGetSyncStandbysFn :
This function obtains an array of walsender positions for the standby
members that are considered synchronous.
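
As an illustration, a priority-method SyncRepGetSyncedLsnsFn could look
roughly like the sketch below (an assumption of mine, not the patch
text; standby_is_synced() is a hypothetical helper standing in for the
walsender-scanning code):

    static bool
    SyncRepGetSyncedLsnsPriority(SyncGroupNode *group,
                                 XLogRecPtr *write_pos,
                                 XLogRecPtr *flush_pos)
    {
        int             n_synced = 0;
        SyncGroupNode  *node;

        *write_pos = InvalidXLogRecPtr;
        *flush_pos = InvalidXLogRecPtr;

        for (node = group->member; node != NULL; node = node->next)
        {
            XLogRecPtr  write_lsn;
            XLogRecPtr  flush_lsn;

            /* Skip members that are not connected and streaming. */
            if (!standby_is_synced(node, &write_lsn, &flush_lsn))
                continue;

            /* The group's LSNs are the minimum over its synced members. */
            if (XLogRecPtrIsInvalid(*write_pos) || write_lsn < *write_pos)
                *write_pos = write_lsn;
            if (XLogRecPtrIsInvalid(*flush_pos) || flush_lsn < *flush_pos)
                *flush_pos = flush_lsn;

            if (++n_synced >= group->wait_num)
                return true;    /* enough standbys have acknowledged */
        }

        return false;           /* not enough synchronous standbys yet */
    }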

This implementation might not be good in some respects, so please give
me feedback. I will also create a new commitfest entry for this patch in
CF5.

Regards,

--
Masahiko Sawada
diff --git a/src/backend/Makefile b/src/backend/Makefile
index b3d5e2e..3e36686 100644
--- a/src/backend/Makefile
+++ b/src/backend/Makefile
@@ -203,7 +203,7 @@ distprep:
 	$(MAKE) -C parser	gram.c gram.h scan.c
 	$(MAKE) -C bootstrap	bootparse.c bootscanner.c
 	$(MAKE) -C catalog	schemapg.h postgres.bki postgres.description 

Re: [HACKERS] Existence check for suitable index in advance when concurrently refreshing.

2016-02-09 Thread Fujii Masao
On Tue, Feb 9, 2016 at 9:11 PM, Michael Paquier
 wrote:
> On Tue, Feb 9, 2016 at 4:27 PM, Fujii Masao  wrote:
>> Thanks for updating the patch!
>> Attached is the updated version of the patch.
>> I removed unnecessary assertion check and change of source code
>> that you added, and improved the source comment.
>> Barring objection, I'll commit this patch.
>
> So, this code basically duplicates what is already in
> refresh_by_match_merge to check if there is a UNIQUE index defined. If
> we are sure that an error is detected earlier in the code as done in
> this patch, wouldn't it be better to replace the error message in
> refresh_by_match_merge() by an assertion?

I'm OK with an assertion if we add a source comment about why
refresh_by_match_merge() can always guarantee that there is
a unique index on the matview. Probably it's because the matview
is locked with an exclusive lock at the start of ExecRefreshMatView(),
i.e., it's guaranteed that we cannot drop any indexes on the matview
after the first check has passed. Also, it might be better to add
another comment noting that the caller of refresh_by_match_merge()
must always check that there is a unique index on the matview before
calling that function, just in case it's called from elsewhere
in the future.

OTOH, I don't think it's not so bad idea to just emit an error, instead.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Mac OS: invalid byte sequence for encoding "UTF8"

2016-02-09 Thread Tom Lane
Artur Zakirov  writes:
>> I think the NIImportOOAffixes() in spell.c should be corrected to avoid
>> this bug.

> I have attached a patch. It adds new functions parse_ooaffentry() and
> get_nextentry() and fixes a couple of comments.

I do not like this patch much.  It is basically "let's stop using sscanf()
because it seems to have a bug on one platform".  There are at least two
things wrong with that approach:

1. By my count there are about 80 uses of *scanf() in our code.  Are we
going to replace every one of them with hand-rolled code?  If not, why
is only this instance vulnerable?  How can we know whether future uses
will have a problem?

2. We're not being very good citizens of the software universe if we
just install a hack in Postgres rather than nagging Apple to fix the
bug at its true source.

I think the appropriate next step to take is to dig into the OS X
sources (see http://www.opensource.apple.com, I think probably the
relevant code is in the Libc package) and identify exactly what is
causing the misbehavior.  That would both allow an informed answer
to point #1 and greatly increase the odds of getting action on a
bug report to Apple.  Even if we end up applying this patch verbatim,
I think we need that information first.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Existence check for suitable index in advance when concurrently refreshing.

2016-02-09 Thread Fujii Masao
On Wed, Feb 10, 2016 at 2:21 AM, Fujii Masao  wrote:
> On Tue, Feb 9, 2016 at 9:11 PM, Michael Paquier
>  wrote:
>> On Tue, Feb 9, 2016 at 4:27 PM, Fujii Masao  wrote:
>>> Thanks for updating the patch!
>>> Attached is the updated version of the patch.
>>> I removed unnecessary assertion check and change of source code
>>> that you added, and improved the source comment.
>>> Barring objection, I'll commit this patch.
>>
>> So, this code basically duplicates what is already in
>> refresh_by_match_merge to check if there is a UNIQUE index defined. If
>> we are sure that an error is detected earlier in the code as done in
>> this patch, wouldn't it be better to replace the error message in
>> refresh_by_match_merge() by an assertion?
>
> I'm OK with an assertion if we add a source comment about why
> refresh_by_match_merge() can always guarantee that there is
> a unique index on the matview. Probably it's because the matview
> is locked with an exclusive lock at the start of ExecRefreshMatView(),
> i.e., it's guaranteed that we cannot drop any indexes on the matview
> after the first check has passed. Also, it might be better to add
> another comment noting that the caller of refresh_by_match_merge()
> must always check that there is a unique index on the matview before
> calling that function, just in case it's called from elsewhere
> in the future.
>
> OTOH, I don't think it's not so bad idea to just emit an error, instead.

Sorry, s/it's not/it's

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: function parse_ident

2016-02-09 Thread Pavel Stehule
Hi

2016-02-08 16:55 GMT+01:00 Teodor Sigaev :

> rebased, messages changes per Tom's proposal
>>
> Cool feature and sometimes I needed it a lot.
>
> But, it seems, there are some bugs in error processing.
>
> 1
> Following query is okay:
> # select * from parse_ident(E'"Some \r Schema".someTable');
>  parse_ident
> --
>  {"Some \r Schema",sometable}
> but:
> % select * from parse_ident(E'"Some \r Schema".9someTable');
>  Schema".9someTable"tifier after "." symbol: ""Some
>
> The carriage return is not escaped in the error message. Presumably any
> other special characters will not be escaped either.
>
> 2
> # select * from parse_ident('.someTable');
> ERROR:  missing identifier after "." symbol: ".someTable"
> Why AFTER  instead of BEFORE?
>

fixed - now the function produces a more adequate message - see the regression tests


>
> 3
> The function successfully truncates a long identifier, but not in the
> case of a quoted identifier.
> select length(a[1]), length(a[2]) from
> parse_ident('"xx".y')
> as a ;
>  length | length
> +
> 414 | 63
>
>
>
fixed - I used the function downcase_truncate_identifier(), which does the
truncation. I agree that in this case the default truncation isn't
practical - the user can truncate explicitly (or implicitly by casting to
"name")
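
For illustration, usage would look roughly like this (output shapes as
described in this thread; the long quoted identifier is abbreviated, so
treat the exact values as approximate):

    -- split a qualified name into its parts
    # select parse_ident('public.some_table');
     parse_ident
    --
     {public,some_table}

    -- explicit truncation of one element, via a cast to "name"
    # select (parse_ident('"SomeVeryLongQuotedIdentifier..."'))[1]::name;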

New patch attached

Thank you for test

Regards

Pavel



>
>
>
>
>
>
> --
> Teodor Sigaev   E-mail: teo...@sigaev.ru
>WWW:
> http://www.sigaev.ru/
>


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Pavel Stehule
I haven't been paying attention to this thread ... but it is sure
> sounding like this feature has gotten totally out of hand.  Suggest
> reconsidering your design goals.
>
> > Or said otherwise, having the [+/-] colV sorting is a way to
> > avoid the question:
> > "we can sort the horizontal header, so why can't we sort the
> > vertical header just the same?"
>
> I would turn that around, and ask why not remove *both* those things.
>
> I do not think we want any client-side sorting in this feature at all,
> because the minute you have any such thing, you are going to have an
> absolutely never-ending stream of demands for more sorting features:
> multi column, numeric vs text, ASC vs DESC, locale-aware, etc etc etc.
> I'd rather reject the feature altogether than expect that psql is going
> to have to grow all of that.
>

I am thinking that without the possibility of sorting data on the client
side, this feature will be significantly limited. You cannot do a
server-side sort for both dimensions, and working with a 2D report where
one dimension is unsorted is not friendly.

But the client-side sorting can be limited to numeric or C-locale
sorting. I don't think the full range of sort possibilities is necessary.

Regards

Pavel


> regards, tom lane
>


Re: [HACKERS] Way to check whether a particular block is on the shared_buffer?

2016-02-09 Thread Robert Haas
On Sun, Feb 7, 2016 at 9:49 PM, Kouhei Kaigai  wrote:
> On the other hand, it also became clear we have to guarantee that the OS
> buffer or storage block is not partially updated during the P2P DMA.
> My motivation is a potential utilization of P2P DMA of SSD-to-GPU to
> filter out unnecessary rows and columns prior to loading to CPU/RAM.
> It needs to ensure PostgreSQL does not write out buffers to OS buffers
> to avoid unexpected data corruption.
>
> What I want to achieve is suspend of buffer write towards a particular
> (relnode, forknum, blocknum) pair for a short time, by completion of
> data processing by GPU (or other external devices).
> In addition, it is preferable being workable regardless of the choice
> of storage manager, even if we may have multiple options on top of the
> pluggable smgr in the future.

It seems like you just need to take an exclusive content lock on the
buffer, or maybe a shared content lock would be sufficient.
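
For example, with the existing buffer-manager API that would look
something like the minimal sketch below (a shared content lock blocks
writers; an exclusive one blocks readers as well):

    Buffer  buf = ReadBuffer(rel, blocknum);

    LockBuffer(buf, BUFFER_LOCK_SHARE);   /* no one can modify the page */
    /* ... set up and wait for the P2P DMA against this block ... */
    LockBuffer(buf, BUFFER_LOCK_UNLOCK);
    ReleaseBuffer(buf);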

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tracing down buildfarm "postmaster does not shut down" failures

2016-02-09 Thread Andrew Dunstan



On 02/09/2016 03:05 PM, Tom Lane wrote:

I wrote:

In any case, we should proceed with fixing things so that buildfarm owners
can specify a higher shutdown timeout for especially slow critters.

I looked into doing this as I suggested yesterday, namely modifying the
buildfarm scripts, and soon decided that it would be a mess; there are
too many cases where "pg_ctl stop" is not invoked directly by the script.

I'm now in favor of applying the PGCTLTIMEOUT patch Noah proposed, and
*removing* the two existing hacks in run_build.pl that try to force -t 120.

The only real argument I can see against that approach is that we'd have
to back-patch the PGCTLTIMEOUT patch to all active branches if we want
to stop the buildfarm failures.  We don't usually back-patch feature
additions.  On the other hand, this wouldn't be the first time we've
back-patched something on grounds of helping the buildfarm, so I find
that argument pretty weak.






OK. I can put out a new release as required.

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi-tenancy with RLS

2016-02-09 Thread Dean Rasheed
On 9 February 2016 at 19:47, Robert Haas  wrote:
> I think you're dismissing Tom's concerns far too lightly.  The
> row_security=off mode, which is the default, becomes unusable for
> non-superusers under this proposal.  That's bad.  And if you switch to
> the other mode, then you might accidentally fail to get all of the
> data in some table you're trying to back up.  That's bad too: that's
> why it isn't the default.  There's a big difference between saying
> "I'm OK with not dumping the tables I can't see" and "I'm OK with not
> dumping all of the data in some table I *can* see".
>
> It seems to me that there's a big difference between policies we ship
> out of the box and policies that are created by users: specifically,
> the former can be assumed benign, while the latter can't.  I think
> that difference matters here, although I'm not sure exactly where to
> go with it.
>

It sounds like there needs to be a separate system_row_security
setting that controls RLS on the system catalogs, and that it should
be on by default in pg_dump.
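
Under that scheme, pg_dump's session setup might become something like
the following (system_row_security being the hypothetical new GUC, so
none of this exists today):

    SET row_security = off;          -- user-defined policies stay disabled
    SET system_row_security = on;    -- built-in catalog policies still apply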

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi-tenancy with RLS

2016-02-09 Thread Robert Haas
On Tue, Feb 9, 2016 at 3:26 PM, Stephen Frost  wrote:
> Arbitrary code execution is quite a different concern from the prior
> concern regarding incomplete dumps.

I've had both concerns all along, and I think I've mentioned them before.

> To the extent that untrusted code execution is an issue (and my
> experience with environments which would deploy RLS tells me that it
> isn't a practical concern), an option could be created which would cause
> an error to be thrown on non-catalog RLS being run.

There's a major release already in the wild that doesn't behave that
way.  And anyway I think that's missing the point: it's true that
features that are turned off don't cause problems, but features that
are turned on shouldn't break things either.

> When it comes to multi-tenancy environments, as this thread is about,
> chances are the only tables you can see are ones which you own or are
> owned by a trusted user, which is why I don't view this as a practical
> concern, but I'm not against having a solution to address the issue
> raised regarding arbitrary code execution, provided it doesn't create
> more problems than it purports to solve.

Well, I'm against accepting this patch without such a solution.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi-tenancy with RLS

2016-02-09 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
> On Tue, Feb 9, 2016 at 3:26 PM, Stephen Frost  wrote:
> > To the extent that untrusted code execution is an issue (and my
> > experience with environments which would deploy RLS tells me that it
> > isn't a practical concern), an option could be created which would cause
> > an error to be thrown on non-catalog RLS being run.
> 
> There's a major release already in the wild that doesn't behave that
> way.

I'm at a loss as to what you're getting at there.  We don't have any
catalog RLS, and when it comes to non-catalog RLS, we do have an option
to throw an error when it's going to be run (and it's the default, as
you pointed out), in the one major version which supports RLS.

> And anyway I think that's missing the point: it's true that
> features that are turned off don't cause problems, but features that
> are turned on shouldn't break things either.

I don't, generally, disagree with that statement, but we have to agree
on what's on vs. off and what is broken vs. working correctly.  See
nearby comments from JD about how non-superuser pg_dump could be seen as
broken when running against an environment where RLS is in use.

> > When it comes to multi-tenancy environments, as this thread is about,
> > chances are the only tables you can see are ones which you own or are
> > owned by a trusted user, which is why I don't view this as a practical
> > concern, but I'm not against having a solution to address the issue
> > raised regarding arbitrary code execution, provided it doesn't create
> > more problems than it purports to solve.
> 
> Well, I'm against accepting this patch without such a solution.

That's at least something which can be built upon then to help this
progress.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Multi-tenancy with RLS

2016-02-09 Thread Joshua D. Drake

On 02/09/2016 12:28 PM, Stephen Frost wrote:

JD,

* Joshua D. Drake (j...@commandprompt.com) wrote:

pg_dump -U $non-super_user

Should just work, period.


That ship has sailed already, where you're running a pg_dump against
objects you don't own and which have RLS enabled on them.


Just to be clear, what I was suggesting is that pg_dump would just work
(and RLS would properly execute); in other words, I shouldn't have to
tell pg_dump to enable row security or else get an error. If RLS is
enabled, the backup should just run with the appropriate permissions.


Or am I missing something?

Sincerely,

JD




Thanks!

Stephen




--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi-tenancy with RLS

2016-02-09 Thread Joe Conway
On 02/09/2016 01:22 PM, Tom Lane wrote:
> Maybe we need to restrict that somehow, or maybe some better solution
> exists that we've not thought of yet.  But in its current state, RLS
> is at least as much a security hazard as it is a security aid.
> I do not want to see it extended in ways that make pg_dump unsafe to
> use.

Ok, I can see that. Maybe we should have a specific GRANT for CREATE
POLICY which is distinct from the privilege to CREATE TABLE?

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


[HACKERS] Convert pltcl from strings to objects

2016-02-09 Thread Jim Nasby
Currently, pl/tcl is implemented through nothing but string 
manipulation. In other words, the C code is effectively creating a giant 
string that the tcl interpreter must re-parse every time the function is 
executed. Additionally, all arguments are treated as raw strings, 
instead of the far more efficient internal tcl object types.


The biggest win comes with how pltcl interfaces with SPI result sets. 
Currently, the entire chunk of tcl code that is executed for each result 
row must be reparsed and recompiled from scratch. Now, the code is 
compiled once and the bytecode is stored.
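
The core of that change is the switch from Tcl's string-based to its
object-based evaluation API, roughly as below (the Tcl C API calls are
the documented ones; the variable names are placeholders):

    /* Old: the loop body arrives as a char * and is reparsed per row. */
    rc = Tcl_Eval(interp, loop_body_string);

    /* New: the loop body is a Tcl_Obj *; after the first evaluation its
     * compiled bytecode is cached inside the object and reused. */
    rc = Tcl_EvalObjEx(interp, loop_body_obj, 0);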


This work is sponsored by Flight Aware (http://flightaware.com/).
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
diff --git a/src/pl/tcl/pltcl.c b/src/pl/tcl/pltcl.c
index dce5d04..fdb6f9e 100644
--- a/src/pl/tcl/pltcl.c
+++ b/src/pl/tcl/pltcl.c
@@ -212,33 +212,33 @@ static pltcl_proc_desc *compile_pltcl_function(Oid 
fn_oid, Oid tgreloid,
   bool pltrusted);
 
 static int pltcl_elog(ClientData cdata, Tcl_Interp *interp,
-  int argc, CONST84 char *argv[]);
+  int objc, Tcl_Obj * const objv[]);
 static int pltcl_quote(ClientData cdata, Tcl_Interp *interp,
-   int argc, CONST84 char *argv[]);
+   int objc, Tcl_Obj * const objv[]);
 static int pltcl_argisnull(ClientData cdata, Tcl_Interp *interp,
-   int argc, CONST84 char *argv[]);
+   int objc, Tcl_Obj * const objv[]);
 static int pltcl_returnnull(ClientData cdata, Tcl_Interp *interp,
-int argc, CONST84 char *argv[]);
+int objc, Tcl_Obj * const objv[]);
 
 static int pltcl_SPI_execute(ClientData cdata, Tcl_Interp *interp,
- int argc, CONST84 char *argv[]);
+ int objc, Tcl_Obj * const objv[]);
 static int pltcl_process_SPI_result(Tcl_Interp *interp,
 CONST84 char *arrayname,
-CONST84 char *loop_body,
+Tcl_Obj * loop_body,
 int spi_rc,
 SPITupleTable *tuptable,
 int ntuples);
 static int pltcl_SPI_prepare(ClientData cdata, Tcl_Interp *interp,
- int argc, CONST84 char *argv[]);
+ int objc, Tcl_Obj * const objv[]);
 static int pltcl_SPI_execute_plan(ClientData cdata, Tcl_Interp *interp,
-  int argc, CONST84 char *argv[]);
+  int objc, Tcl_Obj * const objv[]);
 static int pltcl_SPI_lastoid(ClientData cdata, Tcl_Interp *interp,
- int argc, CONST84 char *argv[]);
+ int objc, Tcl_Obj * const objv[]);
 
 static void pltcl_set_tuple_values(Tcl_Interp *interp, CONST84 char *arrayname,
   int tupno, HeapTuple tuple, 
TupleDesc tupdesc);
 static void pltcl_build_tuple_argument(HeapTuple tuple, TupleDesc tupdesc,
-  Tcl_DString *retval);
+  Tcl_Obj * retobj);
 
 
 /*
@@ -425,23 +425,23 @@ pltcl_init_interp(pltcl_interp_desc *interp_desc, bool 
pltrusted)
/
 * Install the commands for SPI support in the interpreter
 /
-   Tcl_CreateCommand(interp, "elog",
- pltcl_elog, NULL, NULL);
-   Tcl_CreateCommand(interp, "quote",
- pltcl_quote, NULL, NULL);
-   Tcl_CreateCommand(interp, "argisnull",
- pltcl_argisnull, NULL, NULL);
-   Tcl_CreateCommand(interp, "return_null",
- pltcl_returnnull, NULL, NULL);
-
-   Tcl_CreateCommand(interp, "spi_exec",
- pltcl_SPI_execute, NULL, NULL);
-   Tcl_CreateCommand(interp, "spi_prepare",
- pltcl_SPI_prepare, NULL, NULL);
-   Tcl_CreateCommand(interp, "spi_execp",
- pltcl_SPI_execute_plan, NULL, NULL);
-   Tcl_CreateCommand(interp, "spi_lastoid",
- pltcl_SPI_lastoid, NULL, NULL);
+   Tcl_CreateObjCommand(interp, "elog",
+pltcl_elog, NULL, NULL);
+   Tcl_CreateObjCommand(interp, 

Re: [HACKERS] Multi-tenancy with RLS

2016-02-09 Thread Stephen Frost
* Dean Rasheed (dean.a.rash...@gmail.com) wrote:
> On 9 February 2016 at 19:47, Robert Haas  wrote:
> > I think you're dismissing Tom's concerns far too lightly.  The
> > row_security=off mode, which is the default, becomes unusable for
> > non-superusers under this proposal.  That's bad.  And if you switch to
> > the other mode, then you might accidentally fail to get all of the
> > data in some table you're trying to back up.  That's bad too: that's
> > why it isn't the default.  There's a big difference between saying
> > "I'm OK with not dumping the tables I can't see" and "I'm OK with not
> > dumping all of the data in some table I *can* see".
> >
> > It seems to me that there's a big difference between policies we ship
> > out of the box and policies that are created by users: specifically,
> > the former can be assumed benign, while the latter can't.  I think
> > that difference matters here, although I'm not sure exactly where to
> > go with it.
> 
> It sounds like there needs to be a separate system_row_security
> setting that controls RLS on the system catalogs, and that it should
> be on by default in pg_dump.

Right, that's what I had been thinking also.

Thanks (and congrats, btw!),

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Multi-tenancy with RLS

2016-02-09 Thread Joe Conway
On 02/09/2016 12:47 PM, Robert Haas wrote:
> On Tue, Feb 9, 2016 at 3:28 PM, Stephen Frost  wrote:
>> JD,
>>
>> * Joshua D. Drake (j...@commandprompt.com) wrote:
>>> pg_dump -U $non-super_user
>>>
>>> Should just work, period.
>>
>> That ship has sailed already, where you're running a pg_dump against
>> objects you don't own and which have RLS enabled on them.
> 
> But you'll get an error rather than an incomplete dump, and you won't
> run some code that you didn't want to run.  Those distinctions matter.

From the perspective of that unprivileged user, the dump is not
incomplete -- it is exactly as complete as it is supposed to be.

Personally I don't buy that the current situation is a good thing. I
know that the "ship has sailed" and regret not having participated in
the earlier discussions, but I agree with JD here -- the unprivileged
user should not have to even think about whether RLS exists, they should
only see what they have been allowed to see by the privileged users (and
in the context of their own objects, owners are privileged). I don't
think an unprivileged user should get to decide what code runs in order
to make that happen.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Multi-tenancy with RLS

2016-02-09 Thread Robert Haas
On Tue, Feb 9, 2016 at 4:22 PM, Tom Lane  wrote:
> Part of the problem here is that we have *not* created any hard and fast
> distinction between "privileged" and "unprivileged" users; I think that
> even speaking in those terms about RLS risks errors in your thinking.

+1.

> In particular, the code-execution issue arises from the fact that a table
> owner can now cause code to execute *with the permissions of someone else*
> if the someone else is foolish enough to select from his table.  No
> special privileges required, just the ability to create a table.  If we
> make pg_dump run with RLS enabled, then the "foolish" part doesn't need to
> be any more foolish than forgetting a -t switch when using pg_dump.

Yes.  That is exactly why I argued for the current situation to be the
way it is, and I think it would have been a huge mistake if we now
decided otherwise.  I don't have a ton of confidence that the database
is free of problems that would allow one user to assume the privileges
of another - but I certainly don't want to design more such problems
into the server.

> Maybe we need to restrict that somehow, or maybe some better solution
> exists that we've not thought of yet.  But in its current state, RLS
> is at least as much a security hazard as it is a security aid.
> I do not want to see it extended in ways that make pg_dump unsafe to
> use.

I could not agree more.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: schema PL session variables

2016-02-09 Thread Jim Nasby

On 2/9/16 4:13 PM, Corey Huinker wrote:


We're not going to get source compatibility without implementing
packages, and there's no enthusiasm for that. It's been stated a few
times before by some that the only value they see in packages is the
package/session variables. Pavel's idea gives us that.


The other big thing you get is public vs private. You can 
sorta-kinda-almost simulate that with permissions in simple cases, but 
it ultimately falls apart as soon as you want a private function that 
does something as the user calling the function.


When it comes to variables, I think it's a mistake to discuss this patch
while pretending that packages don't exist. For example, if all we wanted
were session variables, there's no reason they'd need to be tied to
schemas. The only reason to tie them to schemas is to try and fake
package support via schemas. I think it'd be a mistake to have
non-schema variables, but let's not fool ourselves as to why that would
be a mistake.


Another problem I have with this is it completely ignores public/private 
session variables. The current claim is that's not a big deal because 
you can only access the variables from a PL, but I give it 2 days of 
this being released before people are asking for a way to access the 
variables directly from SQL. Now you have a problem because if you want 
private variables (which I think is pretty important) your only choice
is to use SECDEF functions, which is awkward at best.



I forgot to mention that if we're FROM-phobic the syntax could also be

IMPORT my_schema.bar AS g_localtext IN OUT text

Either way, you get the idea: the function defines what external globals
it's willing to see, and gives an alias for them, and it's the same
regardless of what the function language is.


ISTM that for plpgsql it would be better to add a namespace level above 
the current top level (which is the function level).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi-tenancy with RLS

2016-02-09 Thread Robert Haas
On Tue, Feb 9, 2016 at 3:28 PM, Stephen Frost  wrote:
> JD,
>
> * Joshua D. Drake (j...@commandprompt.com) wrote:
>> pg_dump -U $non-super_user
>>
>> Should just work, period.
>
> That ship has sailed already, where you're running a pg_dump against
> objects you don't own and which have RLS enabled on them.

But you'll get an error rather than an incomplete dump, and you won't
run some code that you didn't want to run.  Those distinctions matter.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi-tenancy with RLS

2016-02-09 Thread Stephen Frost
JD,

* Joshua D. Drake (j...@commandprompt.com) wrote:
> On 02/09/2016 12:28 PM, Stephen Frost wrote:
> >* Joshua D. Drake (j...@commandprompt.com) wrote:
> >>pg_dump -U $non-super_user
> >>
> >>Should just work, period.
> >
> >That ship has sailed already, where you're running a pg_dump against
> >objects you don't own and which have RLS enabled on them.
> 
> Just to be clear, what I was suggesting is that pg_dump would just
> work (and RLS would properly execute) or in other words, I shouldn't
> have to tell pg_dump to enable row security else throw an error. If
> RLS is enabled, then the backup just runs with appropriate
> permissions.
> 
> Or am I missing something?

You do have to tell pg_dump to enable RLS if you want it to be enabled
when performing a pg_dump.  There are multiple reasons for this, the first
being that, otherwise, you might get an incomplete dump, and secondly,
you might execute a function that some untrusted user wrote and included
in their RLS policy.  We want to avoid both of those, unless you've
specifically asked for it to be done.  That's why row_security is set to
'off' by pg_dump by default.
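
Concretely, the two behaviors look like this (role and database names are
placeholders; --enable-row-security is the existing pg_dump option):

    # default (row_security=off): errors out if an RLS policy applies
    pg_dump -U alice mydb

    # opt in: dumps whatever alice's policies allow her to see
    pg_dump -U alice --enable-row-security mydb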

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Multi-tenancy with RLS

2016-02-09 Thread Tom Lane
Joe Conway  writes:
> Personally I don't buy that the current situation is a good thing. I
> know that the "ship has sailed" and regret not having participated in
> the earlier discussions, but I agree with JD here -- the unprivileged
> user should not have to even think about whether RLS exists, they should
> only see what they have been allowed to see by the privileged users (and
> in the context of their own objects, owners are privileged). I don't
> think an unprivileged user should get to decide what code runs in order
> to make that happen.

Part of the problem here is that we have *not* created any hard and fast
distinction between "privileged" and "unprivileged" users; I think that
even speaking in those terms about RLS risks errors in your thinking.

In particular, the code-execution issue arises from the fact that a table
owner can now cause code to execute *with the permissions of someone else*
if the someone else is foolish enough to select from his table.  No
special privileges required, just the ability to create a table.  If we
make pg_dump run with RLS enabled, then the "foolish" part doesn't need to
be any more foolish than forgetting a -t switch when using pg_dump.

Maybe we need to restrict that somehow, or maybe some better solution
exists that we've not thought of yet.  But in its current state, RLS
is at least as much a security hazard as it is a security aid.
I do not want to see it extended in ways that make pg_dump unsafe to
use.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi-tenancy with RLS

2016-02-09 Thread Stephen Frost
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Joe Conway  writes:
> > Personally I don't buy that the current situation is a good thing. I
> > know that the "ship has sailed" and regret not having participated in
> > the earlier discussions, but I agree with JD here -- the unprivileged
> > user should not have to even think about whether RLS exists, they should
> > only see what they have been allowed to see by the privileged users (and
> > in the context of their own objects, owners are privileged). I don't
> > think an unprivileged user should get to decide what code runs in order
> > to make that happen.
> 
> Part of the problem here is that we have *not* created any hard and fast
> distinction between "privileged" and "unprivileged" users; I think that
> even speaking in those terms about RLS risks errors in your thinking.

I agree that where, exactly, that line is drawn is part of the issue,
and where it's drawn is really system-dependent.  In many
environments, I view Joe's comments as entirely accurate- only
privileged users are allowed to create objects *at all*.  Of course,
there are a lot of environments where everyone is allowed to create
objects and, in those environments, all those users would be viewed as
"unprivileged", generally speaking.

> In particular, the code-execution issue arises from the fact that a table
> owner can now cause code to execute *with the permissions of someone else*
> if the someone else is foolish enough to select from his table.  No
> special privileges required, just the ability to create a table.  If we
> make pg_dump run with RLS enabled, then the "foolish" part doesn't need to
> be any more foolish than forgetting a -t switch when using pg_dump.

That distinction is really only relevant when it comes to pg_dump, as
those same users could use views to cause their code to be executed by
other users who are selecting from their view, and they could switch
between a table and a view quite easily.  From a practical standpoint, we're
making a huge distinction between our client tools: pg_dump must be
protected from X, but we don't have any such qualms or concerns
regarding queries sent from psql.

Perhaps that's the right distinction to make, or perhaps we should come
up with a better answer for psql than what we have now, but I don't
agree that RLS is seriously moving the goalposts, overall, here,
particularly since you're not going to have any RLS policies being
executed by pg_dump when run as a superuser anyway, given Noah's change
to how BYPASSRLS works.

> Maybe we need to restrict that somehow, or maybe some better solution
> exists that we've not thought of yet.  But in its current state, RLS
> is at least as much a security hazard as it is a security aid.
> I do not want to see it extended in ways that make pg_dump unsafe to
> use.

I'm not against coming up with an approach which restricts cases where
user A can write code that will be run under another user's rights,
provided it doesn't make the system overly painful to use.  I don't see
RLS as changing the security risks all that much when you're talking
about regular user queries through psql, and the concern regarding
pg_dump has been addressed through the default of row_security being
off.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Multi-tenancy with RLS

2016-02-09 Thread Stephen Frost
* Joe Conway (m...@joeconway.com) wrote:
> On 02/09/2016 01:22 PM, Tom Lane wrote:
> > Maybe we need to restrict that somehow, or maybe some better solution
> > exists that we've not thought of yet.  But in its current state, RLS
> > is at least as much a security hazard as it is a security aid.
> > I do not want to see it extended in ways that make pg_dump unsafe to
> > use.
> 
> Ok, I can see that. Maybe we should have a specific GRANT for CREATE
> POLICY which is distinct from the privilege to CREATE TABLE?

Well, the only privilege we have now is "CREATE", which allows creation
of any kind of object inside a schema.  I'm generally in favor of
providing more granular 'create table', 'create view', etc. privileges
that can be granted out at the schema level, and 'create policy' would
be appropriate to include in such a set of object-creation permissions.

I don't have any particularly genius ideas about where we'd get the bits
to implement such a grant system though.  We could modify the existing
grant system to use more bits, but given that this would only be
applicable for schemas, perhaps it'd make sense to have another field
in pg_namespace instead?  Not sure, just brainstorming here.
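
Purely as a hypothetical sketch of what such granular grants might look
like (none of this syntax exists today):

GRANT CREATE TABLE, CREATE VIEW ON SCHEMA app TO app_owner;
GRANT CREATE POLICY ON SCHEMA app TO security_admin;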

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] proposal: schema PL session variables

2016-02-09 Thread Corey Huinker
On Tue, Feb 9, 2016 at 2:55 PM, David G. Johnston <david.g.johns...@gmail.com> wrote:

> On Tue, Feb 9, 2016 at 11:32 AM, Corey Huinker 
> wrote:
>
>>
>> Oh, and I suggest we call them SESSION variables rather than SCHEMA
>> variables, to reinforce the idea of how long the values in the variables
>> live. A session variable is in a sense a 1x1 temp table, whose definition
>> persists across sessions but whose value does not.
>>
>> Of course, if they do persist across sessions, then yeah, SCHEMA makes
>> more sense. But every package variable in Oracle PL/SQL was initialized
>> when the package was first loaded into the session.
>>
>
> The key distinction for SCHEMA was that all functions in the schema would
> be able to see them (and only those in the schema).
>
> I am a bit partial, with little deep thought, to the IMPORT mechanic.
> Changing them to actual session variables would be doable and you could
> allow for the IMPORT specification to use search_path or explicit means to
> locate said variables regardless of which schema they exist in.
>
> However, part of the goal is to blend into the broader database community
> and thus increase porting capabilities.  I'm not sure how well this would
> help fulfill that goal.
>
>
We're not going to get source compatibility without implementing packages,
and there's no enthusiasm for that. It's been stated a few times before by
some that the only value they see in packages is the package/session
variables. Pavel's idea gives us that.

I forgot to mention that if we're FROM-phobic the syntax could also be

IMPORT my_schema.bar AS g_localtext IN OUT text

Either way, you get the idea: the function defines what external globals
it's willing to see, and gives an alias for them, and it's the same
regardless of what the function language is.


Re: [HACKERS] proposal: schema PL session variables

2016-02-09 Thread Jim Nasby

On 2/8/16 10:02 AM, Pavel Stehule wrote:


I think it would make sense to implement the interface in at least
one of our other supported PLs. I'm not entirely clear how well this
will match up with, say, plperl, but I'd be interested to see.


The minimalistic interface can be based on get/set functions. We can do
necessary transformations there.


get/set functions where?

I don't think that really makes sense. I would expect schema variables 
to be exposed to a function as variables or attributes, either in the 
global namespace for that PL, or as an attribute of some object (ie the 
plpy object in plpython).
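
For instance, a hypothetical sketch of the plpython shape (the
plpy.schema_vars attribute is invented for illustration; nothing like
it exists today):

CREATE FUNCTION bump_counter() RETURNS integer
LANGUAGE plpythonu AS $$
# hypothetical: schema variables surfaced as a dict on the plpy object
plpy.schema_vars['counter'] += 1
return plpy.schema_vars['counter']
$$;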


I certainly wouldn't expect this patch to do that for all existing PLs, 
but I think it's important to do it for one PL besides plpgsql to make 
sure there's no gotchas.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: make NOTIFY list de-duplication optional

2016-02-09 Thread Filip Rembiałkowski
On Tue, Feb 9, 2016 at 12:15 AM, Merlin Moncure  wrote:

> I wonder if the third argument
> should be a boolean however.  If we make it 'text, 'send mode',
> instead, we could leave some room for more specialization of the
> queuing behavior.
>
> For example, we've had a couple of requests over the years to have an
> 'immediate' mode which dumps the notification immediately to the
> client without waiting for tx commit. This may or may not be a good
> idea, but if it was ultimately proved to be, it could be introduced as
> an alternate mode without adding an extra function.

But then it becomes disputable if SQL syntax change makes sense.

--- we had this:
NOTIFY channel [ , payload ]
--- and in this patch we have this:
NOTIFY [ ALL | DISTINCT ] channel [ , payload ]
--- but maybe we should have this?
NOTIFY channel [ , payload [ , mode ] ]

I'm not sure which direction is better with non-standard SQL additions.
Recycling keywords or adding more commas?
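
For comparison, the function form under each option might look roughly
like this (the three-argument pg_notify() is hypothetical, mirroring
the patch and Merlin's suggestion):

SELECT pg_notify('jobs', 'job 42 done');          -- current form, de-duplicated
SELECT pg_notify('jobs', 'job 42 done', true);    -- patch: boolean = send duplicates
SELECT pg_notify('jobs', 'job 42 done', 'all');   -- alternative: text 'send mode'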


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi-tenancy with RLS

2016-02-09 Thread Joe Conway
On 02/09/2016 11:47 AM, Robert Haas wrote:
> On Fri, Jan 15, 2016 at 11:53 AM, Stephen Frost  wrote:
>>> Whereupon you'd have no certainty that what you got represented a
>>> complete dump of your own data.
>>
>> It would be a dump of what you're allowed to see, rather than an error
>> saying you couldn't dump something you couldn't see, which is the
>> alternative we're talking about here.  Even if you've got a dependency
>> to something-or-other, if you don't have access to it, then you're
>> going to get an error.
> 
> I think you're dismissing Tom's concerns far too lightly.  The
> row_security=off mode, which is the default, becomes unusable for
> non-superusers under this proposal.  That's bad.  And if you switch to
> the other mode, then you might accidentally fail to get all of the
> data in some table you're trying to back up.  That's bad too: that's
> why it isn't the default.  There's a big difference between saying
> "I'm OK with not dumping the tables I can't see" and "I'm OK with not
> dumping all of the data in some table I *can* see".

I don't grok what you're saying here. If I, as a non-superuser could
somehow see all the rows in a table just by running pg_dump, including
rows that I could not normally see due to RLS policies, *that* would be
bad. I should have no expectation of being able to dump rows I can't
normally see.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Bug in StartupSUBTRANS

2016-02-09 Thread Simon Riggs
On 9 February 2016 at 18:42, Jeff Janes  wrote:

> While testing the crash resilience of the recent 2-part-commit
> improvements, I've run into a problem where sometimes after a crash
> the recovery process creates zeroed files in pg_subtrans until it
> exhausts all disk space.
>

Not sure which patch you're talking about there (2-part-commit).


> Looking at the code, it looks like it does not anticipate that the xid
> might wrap around, meaning startPage/endPage might also wrap around.
> But obviously should not do so at int_max but rather at some much
> smaller other value.
>

Hmm, looks like the != part attempted to wrap, but just didn't get it right.

Your patch looks right to me, so I will commit, barring objections... with
backpatch. Likely to 9.0, AFAICS.

-- 
Simon Riggs   http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] pl/pgSQL, get diagnostics and big data

2016-02-09 Thread Christian Ullrich

* Andreas 'ads' Scherbaum wrote:


one of our customers approached us and complained that GET DIAGNOSTICS
row_count returns invalid results if the number of rows is > 2^31. It's



Attached patch expands the row_count to 64 bit.

diagnostics=# select testfunc_pg((2^32 + 5)::bigint);
 testfunc_pg
-------------
  4295017296
(1 row)
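
For orientation, a minimal plpgsql sketch of the feature under test
(hypothetical table and function names, assuming the patched 64-bit
ROW_COUNT):

CREATE FUNCTION insert_rows(n bigint) RETURNS bigint
LANGUAGE plpgsql AS $$
DECLARE
    cnt bigint;   -- must be declared bigint once ROW_COUNT can exceed 2^31
BEGIN
    INSERT INTO big_table SELECT g FROM generate_series(1, n) g;
    GET DIAGNOSTICS cnt = ROW_COUNT;
    RETURN cnt;
END;
$$;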


This is my first patch review, but I have to get my feet wet at some 
point, and this is a nice, small patch to do that.


Following the checklist from the wiki:

- Is the patch in context format: Yes.

- Does it apply cleanly to master: Yes.

- Does it include reasonable tests, doc patches, etc.: No. While
  it would be nice if it had some, a test that inserts 2^32 rows
  will take a while and can hardly be called reasonable.


The patch is designed to expand the size of the "affected records" count 
in the command tag from 32 to 64 bits.


- Does it do that: Yes.

- Do we want that: Yes, because it is motivated by reports from users
  who have queries like that in real life.

- Do we already have it: No.

- Does it follow SQL spec or community-agreed behavior: This is not
  covered by the SQL standard and there has not, to my knowledge, been
  any discussion on this point on -hackers. It is, however, the
  obvious approach to solving the specific issue.

- Does it include pg_dump support: n/a

- Are there dangers: Existing applications and client libraries must
  support the increased maximum size (up to ten additional digits) and
  maximum value. libpq apparently does not parse the command tag, only
  stores it as a string for retrieval by PQcmdStatus(), so it is not
  affected in terms of parsing the value, and for storage, it uses a
  64-character buffer, which will overflow if the command name part of
  the tag exceeds 31 characters (63 - 20 [row count] - 10 [OID] - 2
  [spaces]). The longest command name I can think of is "REFRESH
  MATERIALIZED VIEW" which, at 25 characters, stays comfortably below
  this limit, and does not include a row count anyway.

- Have all the bases been covered: The patch changes all locations
  where the command tag is formatted, and where the record count is
  retrieved by PL/pgSQL.

- Does the patch follow the coding guidelines: I believe so.

- Are there portability issues/Will it work on Windows/BSD etc.:

  No, it will not work correctly on Windows when built with MSVC,
  although it may work with MinGW.

  +++ postgresql-9.5.0/src/backend/tcop/pquery.c
  @@ -195,7 +195,7 @@
   {
case CMD_SELECT:
snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
  -  "SELECT %u", queryDesc->estate->es_processed);
  +  "SELECT %lu", queryDesc->estate->es_processed);


  %lu formats unsigned long. "long" is problematic in terms of
  portability, because sizeof(long) is different everywhere. It is 32
  bits on Windows and on 32-bit *nix, and 64 bits on 64-bit *nix.

  I added the following line to the INSERT formatting in pquery.c:

queryDesc->estate->es_processed += 471147114711LL;

  This number is 0x6DB28E70D7; so inserting one row should return
  "INSERT 0 2995679448" (0xB28E70D8):

postgres=# insert into t1 values (0);
INSERT 0 2995679448

  To fix this, I think it will be enough to change the format strings to
  use "%zu" instead of "%lu". pg_snprintf() is selected by configure if
  the platform's snprintf() does not support the "z" conversion. I tried
  this, and it appears to work:

postgres=# insert into t1 values (0);
INSERT 0 471147114712

  I have looked for other uses of "%lu", and found none that may cause
  the same issue; apparently they are all used with values that clearly
  have 32-bit type; actually, most of them are used to format error
  codes in Windows-specific code.

- Are the comments sufficient and accurate: Yes.

- Does it do what it says, correctly: Yes, except for the Windows thing.

- Does it produce compiler warnings: No. First, pg_snprintf() does not
  use the system implementation, and second, a warning (C4477) for
  this kind of format string mismatch was only added in VS2015, which
  is not officially supported (it works for me).

- Can you make it crash: No. The problematic argument always appears
  last in the sprintf() calls, so the format string issue should not
  be exploitable.

I did not run the regression tests or do the "performance" sections 
after I found the Windows issue. I do not think it will negatively 
affect performance, though.


In all, if replacing four "l"s with "z"s is indeed enough, I think this 
patch is an appropriate solution for solving the underlying issue.


--
Christian



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pl/pgSQL, get diagnostics and big data

2016-02-09 Thread Christian Ullrich
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   tested, failed
Spec compliant:   not tested
Documentation:not tested

* Andreas 'ads' Scherbaum wrote:

> one of our customers approached us and complained that GET DIAGNOSTICS
> row_count returns invalid results if the number of rows is > 2^31. It's

> Attached patch expands the row_count to 64 bit.
>
> diagnostics=# select testfunc_pg((2^32 + 5)::bigint);
>  testfunc_pg
> -------------
>   4295017296
> (1 row)

This is my first patch review, but I have to get my feet wet at some point, and 
this is a nice, small patch to do that.

Following the checklist from the wiki:

- Is the patch in context format: Yes.

- Does it apply cleanly to master: Yes.

- Does it include reasonable tests, doc patches, etc.: No. While
  it would be nice if it had some, a test that inserts 2^32 rows
  will take a while and can hardly be called reasonable.


The patch is designed to expand the size of the "affected records" count in the 
command tag from 32 to 64 bits.

- Does it do that: Yes.

- Do we want that: Yes, because it is motivated by reports from users
  who have queries like that in real life.

- Do we already have it: No.

- Does it follow SQL spec or community-agreed behavior: This is not
  covered by the SQL standard and there has not, to my knowledge, been
  any discussion on this point on -hackers. It is, however, the
  obvious approach to solving the specific issue.

- Does it include pg_dump support: n/a

- Are there dangers: Existing applications and client libraries must
  support the increased maximum size (up to ten additional digits) and
  maximum value. libpq apparently does not parse the command tag, only
  stores it as a string for retrieval by PQcmdStatus(), so it is not
  affected in terms of parsing the value, and for storage, it uses a
  64-character buffer, which will overflow if the command name part of
  the tag exceeds 31 characters (63 - 20 [row count] - 10 [OID] - 2
  [spaces]). The longest command name I can think of is "REFRESH
  MATERIALIZED VIEW" which, at 25 characters, stays comfortably below
  this limit, and does not include a row count anyway.

- Have all the bases been covered: The patch changes all locations
  where the command tag is formatted, and where the record count is
  retrieved by PL/pgSQL.

- Does the patch follow the coding guidelines: I believe so.

- Are there portability issues/Will it work on Windows/BSD etc.:

  No, it will not work correctly on Windows when built with MSVC,
  although it may work with MinGW.

  +++ postgresql-9.5.0/src/backend/tcop/pquery.c
  @@ -195,7 +195,7 @@
   {
 case CMD_SELECT:
 snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
  - "SELECT %u", queryDesc->estate->es_processed);
  + "SELECT %lu", queryDesc->estate->es_processed);


  %lu formats unsigned long. "long" is problematic in terms of
  portability, because sizeof(long) is different everywhere. It is 32
  bits on Windows and on 32-bit *nix, and 64 bits on 64-bit *nix.

  I added the following line to the INSERT formatting in pquery.c:

queryDesc->estate->es_processed += 471147114711LL;

  This number is 0x6DB28E70D7; so inserting one row should return
  "INSERT 0 2995679448" (0xB28E70D8):

postgres=# insert into t1 values (0);
INSERT 0 2995679448

  To fix this, I think it will be enough to change the format strings to
  use "%zu" instead of "%lu". pg_snprintf() is selected by configure if
  the platform's snprintf() does not support the "z" conversion. I tried
  this, and it appears to work:

postgres=# insert into t1 values (0);
INSERT 0 471147114712

  I have looked for other uses of "%lu", and found none that may cause
  the same issue; apparently they are all used with values that clearly
  have 32-bit type; actually, most of them are used to format error
  codes in Windows-specific code.

- Are the comments sufficient and accurate: Yes.

- Does it do what it says, correctly: Yes, except for the Windows thing.

- Does it produce compiler warnings: No. First, pg_snprintf() does not
  use the system implementation, and second, a warning (C4477) for
  this kind of format string mismatch was only added in VS2015, which
  is not officially supported (it works for me).

- Can you make it crash: No. The problematic argument always appears
  last in the sprintf() calls, so the format string issue should not
  be exploitable.

I did not run the regression tests or do the "performance" sections after I 
found the Windows issue. I do not think it will negatively affect performance, 
though.

In all, if replacing four "l"s with "z"s is indeed enough, I think this patch 
is an appropriate solution for solving the underlying issue.

The new status of this patch is: Waiting on Author

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] proposal: schema PL session variables

2016-02-09 Thread David G. Johnston
On Tue, Feb 9, 2016 at 11:32 AM, Corey Huinker wrote:

>
> Oh, and I suggest we call them SESSION variables rather than SCHEMA
> variables, to reinforce the idea of how long the values in the variables
> live. A session variable is in a sense a 1x1 temp table, whose definition
> persists across sessions but whose value does not.
>
> Of course, if they do persist across sessions, then yeah, SCHEMA makes
> more sense. But every package variable in Oracle PL/SQL was initialized
> when the package was first loaded into the session.
>
>
The key distinction for SCHEMA was that all functions in the schema would
be able to see them (and only those in the schema).

I am a bit partial, with little deep thought, to the IMPORT mechanic.
Changing them to actual session variables would be doable and you could
allow for the IMPORT specification to use search_path or explicit means to
locate said variables regardless of which schema they exist in.

However, part of the goal is to blend into the broader database community
and thus increase porting capabilities.  I'm not sure how well this would
help fulfill that goal.

David J.


Re: [HACKERS] why can the isolation tester handle only one waiting process?

2016-02-09 Thread Andres Freund
On February 9, 2016 7:12:23 PM GMT+01:00, Robert Haas wrote:
>On Tue, Sep 8, 2015 at 5:11 PM, Robert Haas 
>wrote:
>> Here's an updated patch series with some more improvements to the
>> isolationtester code, and some better test cases.
>
>OK, here's a final set of patches that I intend to commit in the next
>few days if nobody objects, per discussion on the thread about
>parallelism fixes.  It's basically the same as the previous set, but
>there's a bug fix and an additional test case.

Fwiw, as the person starting the ruckus over there, I think something like 
isolationtester has a much lower need for careful review, consensus et al than 
say lock.c and deadlock.c.

Andres

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: schema PL session variables

2016-02-09 Thread Corey Huinker
On Tue, Feb 9, 2016 at 9:58 AM, Pavel Stehule wrote:

>
>
> 2016-02-09 15:32 GMT+01:00 Marko Tiikkaja :
>
>> On 08/02/16 14:16, Pavel Stehule wrote:
>>
>>> 2016-02-08 13:53 GMT+01:00 Marko Tiikkaja :
>>>

>>>> Yeah, and that's exactly what I don't want, because that means that CREATE
>>>> SCHEMA VARIABLE suddenly breaks existing code.


>>> theoretically yes, but this conflict can be 100% detected - so no silent
>>> bug is possible, and plpgsql_check can find this issue well. If you don't use
>>> schema variable, then your code will be correct. You have to explicitly
>>> create the variable, and if there is any problem, then the problem
>>> will be only in PL functions in one schema. And you can identify it by
>>> static analysis.
>>>
>>
>> I'm sorry, but I think you've got your priorities completely backwards.
>> You're saying that it's OK to add a footgun because blown-off pieces of
>> feet can be found by using a third party static analyzer barely anyone
>> uses.  And at best, that footgun is only a very minor convenience (though
>> I'd argue that omitting it actually hurts readability).
>>
>
> I am not blocking the integration of plpgsql_check into upstream. I spent
> hundreds of hours on it.
>
> Can we look at this problem from a different side? What can I do to make
> the proposed schema variables safe to use?
>
> The possible ways:
>
> 1. a required prefix like : or @. I don't prefer it because a) it is hard
> to find an agreement - Oracle fans like ":", MSSQL likes @, others maybe $,
> b) with any currently unsupported syntax I have to fix the SQL lexer and
> parser
>
> 2. a requirement to use qualified names everywhere - it can work, but I
> don't prefer it, because it can be unpleasant to write long qualified
> identifiers. There are no schema aliases in PL/pgSQL; this could be solved
> by variable aliases, but that requires an alias.
>
> 3. plpgsql GUC where schema variables are: a) disabled, b) enabled, c)
> only qualified names are allowed - it is similar to #variable_conflict
> option
>
> I prefer @3 with "c" as default, but I can live with @2, and dislike @1
> due to the reasons mentioned.
>
> Can you be satisfied by any mentioned variant?
>
> Regards
>
> Pavel
>
>
>>
>> That makes absolutely no sense to me at all.
>>
>>
>> .m
>>
>
>
Would it make sense to explicitly import variables in function definitions?

CREATE SESSION VARIABLE foo integer;
CREATE SESSION VARIABLE my_schema.bar text;
SET SESSION VARIABLE foo to 4;
SET SESSION VARIABLE my_schema.bar to 'hi mom';

CREATE FUNCTION my_func (p_param text) returns boolean
LANGUAGE SQL
IMPORT g_foo integer FROM foo,
IMPORT g_textval IN OUT text FROM my_schema.bar

AS $$

SELECT COUNT(*) > 1
FROM my_table
WHERE id = g_foo
AND name = g_textval;
$$;


The IMPORT clause would be something like:

IMPORT local_var_name [IN] [OUT] type FROM [session variable | expression ]


And obviously it would reject importing an expression as an OUT type.
Importing an expression would largely serve the purpose of a compile-time
macro, or allowing us to pass parameters into anonymous blocks, something
we've wanted for a while now.
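
For example, a hypothetical anonymous-block use, extending the same
invented syntax:

DO IMPORT g_batch integer FROM 1000
$$
BEGIN
    DELETE FROM work_queue
    WHERE id IN (SELECT id FROM work_queue ORDER BY id LIMIT g_batch);
END;
$$;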

With something like this, the session variables are seen as parameters
inside the function regardless of language and with no new prefix, :, @, or
otherwise.

Oh, and I suggest we call them SESSION variables rather than SCHEMA
variables, to reinforce the idea of how long the values in the variables
live. A session variable is in a sense a 1x1 temp table, whose definition
persists across sessions but whose value does not.

Of course, if they do persist across sessions, then yeah, SCHEMA makes more
sense. But every package variable in Oracle PL/SQL was initialized when the
package was first loaded into the session.


Re: [HACKERS] Way to check whether a particular block is on the shared_buffer?

2016-02-09 Thread Kouhei Kaigai
> -Original Message-
> From: Robert Haas [mailto:robertmh...@gmail.com]
> Sent: Wednesday, February 10, 2016 1:58 AM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Jim Nasby; pgsql-hackers@postgresql.org; Amit Langote
> Subject: ##freemail## Re: [HACKERS] Way to check whether a particular block is
> on the shared_buffer?
> 
> On Sun, Feb 7, 2016 at 9:49 PM, Kouhei Kaigai  wrote:
> > On the other hands, it also became clear we have to guarantee OS buffer
> > or storage block must not be updated partially during the P2P DMA.
> > My motivation is a potential utilization of P2P DMA of SSD-to-GPU to
> > filter out unnecessary rows and columns prior to loading to CPU/RAM.
> > It needs to ensure PostgreSQL does not write out buffers to OS buffers
> > to avoid unexpected data corruption.
> >
> > What I want to achieve is suspend of buffer write towards a particular
> > (relnode, forknum, blocknum) pair for a short time, by completion of
> > data processing by GPU (or other external devices).
> > In addition, it is preferable being workable regardless of the choice
> > of storage manager, even if we may have multiple options on top of the
> > pluggable smgr in the future.
> 
> It seems like you just need to take an exclusive content lock on the
> buffer, or maybe a shared content lock would be sufficient.
>
Unfortunately, it was not sufficient.

Due to that assumption, the buffer page to be suspended does not exist
when a backend process issues a series of P2P DMA commands.  (If the
block were already loaded into the shared buffers, there would be no need
to issue P2P DMA at all; the usual memory<->device DMA would be used,
because RAM is much faster than SSD.)
It knows the (rel, fork, block) triple, but no BufferDesc for this block
exists, so it cannot acquire any locks in the BufferDesc structure.

Even if the block does not exist at this point, a concurrent process may
load the same page.  A BufferDesc would be assigned to the page at that
point, but the process that issued the P2P DMA command has no chance to
lock anything in that BufferDesc.

That is why I assume the suspend/resume mechanism should take the
(rel, fork, block) triple as the identifier of the target block.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tracing down buildfarm "postmaster does not shut down" failures

2016-02-09 Thread Andrew Dunstan



On 02/09/2016 06:46 PM, Andrew Dunstan wrote:



On 02/09/2016 05:53 PM, Tom Lane wrote:




Andrew, I wonder if I could prevail on you to make axolotl run "make
check" on HEAD in src/interfaces/ecpg/ until it fails, so that we can
see if the logging I added tells anything useful about this.





Will do.



Incidentally, as I noted earlier, the ecpg tests don't honour 
TEMP_CONFIG, and in axolotl's case this could well make a difference, as 
it is set up like this:


extra_config => {
    DEFAULT => [
        q(log_line_prefix = '%m [%c:%l] '),
        "log_connections = 'true'",
        "log_disconnections = 'true'",
        "log_statement = 'all'",
        "fsync = off",
        "stats_temp_directory='/home/andrew/bf/root/stats_temp/$branch'"
    ],
},

So it's not running with fsync off or using the ramdisk for 
stats_temp_directory. Of course, that doesn't explain why we're not 
seeing it on branches earlier than 9.5, but it could explain why we're 
only seeing it on the ecpg tests.


cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: NextXID format change (was Re: [HACKERS] exposing pg_controldata and pg_config as functions)

2016-02-09 Thread Joe Conway
On 01/19/2016 07:04 PM, Michael Paquier wrote:
> On Wed, Jan 20, 2016 at 11:41 AM, Alvaro Herrera
>  wrote:
>> Joe Conway wrote:
>>
>>> The attached includes Bruce's change, plus I found two additional sites
>>> that appear to need the same change. The xlog.c change is just a DEBUG
>>> message, so not a big deal. I'm less certain if the xlogdesc.c change
>>> might create some fallout.
>>
>> Hm, pg_xlogdump links the rmgrdesc files, so perhaps you might need to
>> adjust expected test output for it.  Not really sure.
> 
> We don't depend on this output format in any tests AFAIK, at least
> check-world is not complaining here and pg_xlogdump has no dedicated
> tests. There may be some utility in the outside world doing some
> manipulation of the string generated for this record, but that's not
> worth worrying about anyway.
> 
> Patch looks fine, I have not spotted any other places that need a refresh.

I'll commit the attached tomorrow if there are no other concerns voiced.

In the spirit of the dev meeting discussion, I am trying to use the
commit message template discussed. Something like:

-- email subject limit -
Change delimiter used for display of NextXID

NextXID has been rendered in the form of a pg_lsn even though it
really is not. This can cause confusion, so change the format from
%u/%u to %u:%u, per discussion on hackers.

Complaint by me, patch by me and Bruce, reviewed by Michael Paquier
and Alvaro. Applied to HEAD only.

Reported-by: Joe Conway
Author: Joe Conway, Bruce Momjian
Reviewed-by: Michael Paquier, Alvaro Herrera
Tested-by: Michael Paquier
Backpatch-through: master
-- email subject limit -

That does look pretty redundant though. Thoughts?

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index b694dea..2dcbfbd 100644
*** a/src/backend/access/rmgrdesc/xlogdesc.c
--- b/src/backend/access/rmgrdesc/xlogdesc.c
*** xlog_desc(StringInfo buf, XLogReaderStat
*** 43,49 
  		CheckPoint *checkpoint = (CheckPoint *) rec;
  
  		appendStringInfo(buf, "redo %X/%X; "
! 		 "tli %u; prev tli %u; fpw %s; xid %u/%u; oid %u; multi %u; offset %u; "
  		 "oldest xid %u in DB %u; oldest multi %u in DB %u; "
  		 "oldest/newest commit timestamp xid: %u/%u; "
  		 "oldest running xid %u; %s",
--- 43,49 
  		CheckPoint *checkpoint = (CheckPoint *) rec;
  
  		appendStringInfo(buf, "redo %X/%X; "
! 		 "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; "
  		 "oldest xid %u in DB %u; oldest multi %u in DB %u; "
  		 "oldest/newest commit timestamp xid: %u/%u; "
  		 "oldest running xid %u; %s",
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 7d5d493..ee87e1b 100644
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
*** StartupXLOG(void)
*** 6284,6290 
    (uint32) (checkPoint.redo >> 32), (uint32) checkPoint.redo,
  	wasShutdown ? "TRUE" : "FALSE")));
  	ereport(DEBUG1,
! 			(errmsg_internal("next transaction ID: %u/%u; next OID: %u",
  	checkPoint.nextXidEpoch, checkPoint.nextXid,
  	checkPoint.nextOid)));
  	ereport(DEBUG1,
--- 6284,6290 
    (uint32) (checkPoint.redo >> 32), (uint32) checkPoint.redo,
  	wasShutdown ? "TRUE" : "FALSE")));
  	ereport(DEBUG1,
! 			(errmsg_internal("next transaction ID: %u:%u; next OID: %u",
  	checkPoint.nextXidEpoch, checkPoint.nextXid,
  	checkPoint.nextOid)));
  	ereport(DEBUG1,
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index e7e072f..5dd2dbc 100644
*** a/src/bin/pg_controldata/pg_controldata.c
--- b/src/bin/pg_controldata/pg_controldata.c
*** main(int argc, char *argv[])
*** 252,258 
  		   ControlFile.checkPointCopy.PrevTimeLineID);
  	printf(_("Latest checkpoint's full_page_writes: %s\n"),
  		   ControlFile.checkPointCopy.fullPageWrites ? _("on") : _("off"));
! 	printf(_("Latest checkpoint's NextXID:  %u/%u\n"),
  		   ControlFile.checkPointCopy.nextXidEpoch,
  		   ControlFile.checkPointCopy.nextXid);
  	printf(_("Latest checkpoint's NextOID:  %u\n"),
--- 252,258 
  		   ControlFile.checkPointCopy.PrevTimeLineID);
  	printf(_("Latest checkpoint's full_page_writes: %s\n"),
  		   ControlFile.checkPointCopy.fullPageWrites ? _("on") : _("off"));
! 	printf(_("Latest checkpoint's NextXID:  %u:%u\n"),
  		   ControlFile.checkPointCopy.nextXidEpoch,
  		   ControlFile.checkPointCopy.nextXid);
  	printf(_("Latest checkpoint's NextOID:  %u\n"),
diff --git a/src/bin/pg_resetxlog/pg_resetxlog.c b/src/bin/pg_resetxlog/pg_resetxlog.c
index ca706a5..525b82b 100644
*** a/src/bin/pg_resetxlog/pg_resetxlog.c
--- 

Re: [HACKERS] Tracing down buildfarm "postmaster does not shut down" failures

2016-02-09 Thread Andrew Dunstan



On 02/09/2016 07:49 PM, Tom Lane wrote:

Andrew Dunstan  writes:

So it's not running with fsync off or using the ramdisk for
stats_temp_directory. Of course, that doesn't explain why we're not
seeing it on branches earlier than 9.5, but it could explain why we're
only seeing it on the ecpg tests.

BTW, the evidence we already have from axolotl's last run is that
post-checkpoint shutdown in the ecpg test was pretty quick:

LOG:  database system is shut down at 2016-02-09 16:31:14.784 EST
LOG:  lock files all released at 2016-02-09 16:31:14.817 EST

However, I'd already noted from some other digging in the buildfarm
logs that axolotl's speed seems to vary tremendously.  I do not
know what else you typically run on that hardware, but putting it
under full load might help prove things.




Almost nothing. It's a Raspberry Pi 2B that I got mainly to run a 
buildfarm animal. About the only other thing of note is a very lightly 
configured Nagios instance.


Of course, I could put it under continuous load running a compilation or 
something.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tracing down buildfarm "postmaster does not shut down" failures

2016-02-09 Thread Tom Lane
I wrote:
> ... However, there is something else happening
> on axolotl.  Looking at the HEAD and 9.5 branches, there are three very
> similar failures in the ECPG step within the past 60 days:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=axolotl&dt=2016-02-08%2014%3A49%3A23
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=axolotl&dt=2015-12-15%2018%3A49%3A31
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=axolotl&dt=2015-12-12%2001%3A44%3A39

Oh look, searching with the "failures" BF webpage turns up a fourth such
failure on sungazer:

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer&dt=2016-01-18%2014%3A45%3A15

The plot thickens.  I still have no clue what's going wrong though.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add schema-qualified relnames in constraint error messages.

2016-02-09 Thread Daniel Verite
Shulgin, Oleksandr wrote:

> Most importantly, I'd like to learn of better options than storing the
> whole last_result in psql's pset structure.

I guess that you could, each time a query fails, gather silently the
result of \errverbose, store it in a buffer, discard the PGresult,
and in case the user does \errverbose before running another query,
output what was in that buffer.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tracing down buildfarm "postmaster does not shut down" failures

2016-02-09 Thread Andrew Dunstan



On 02/09/2016 05:53 PM, Tom Lane wrote:




Andrew, I wonder if I could prevail on you to make axolotl run "make
check" on HEAD in src/interfaces/ecpg/ until it fails, so that we can
see if the logging I added tells anything useful about this.





Will do.

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tracing down buildfarm "postmaster does not shut down" failures

2016-02-09 Thread Tom Lane
I wrote:
> I'm not sure whether there's anything to be gained by leaving the tracing
> code in there till we see actual buildfarm fails.  There might be another
> slowdown mechanism somewhere, but I rather doubt it.  Thoughts?

Hmmm ... I take that back.  AFAICT, the failures on Noah's AIX zoo are
sufficiently explained by the "mdpostckpt takes a long time after the
regression tests" theory.  However, there is something else happening
on axolotl.  Looking at the HEAD and 9.5 branches, there are three very
similar failures in the ECPG step within the past 60 days:

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=axolotl&dt=2016-02-08%2014%3A49%3A23
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=axolotl&dt=2015-12-15%2018%3A49%3A31
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=axolotl&dt=2015-12-12%2001%3A44%3A39

In all three, we got "pg_ctl: server does not shut down", but the
postmaster log claims that it shut down, and pretty speedily too.
For example, in the 2015-12-12 failure,

LOG:  received fast shutdown request
LOG:  aborting any active transactions
LOG:  autovacuum launcher shutting down
LOG:  shutting down
LOG:  checkpoint starting: shutdown immediate
LOG:  checkpoint complete: wrote 176 buffers (1.1%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.039 s, sync=0.000 s, total=0.059 s; sync files=0, longest=0.000 s, average=0.000 s; distance=978 kB, estimate=978 kB
LOG:  database system is shut down

We have no theory that would account for postmaster shutdown stalling
after the end of ShutdownXLOG, but that seems to be what happened.
How come?  Why does only the ECPG test seem to be affected?

It's also pretty fishy that we have three failures in 60 days on HEAD+9.5
but none before that, and none in the older branches.  That smells like
a recently-introduced bug, though I have no idea what.

Andrew, I wonder if I could prevail on you to make axolotl run "make
check" on HEAD in src/interfaces/ecpg/ until it fails, so that we can
see if the logging I added tells anything useful about this.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for N synchronous standby servers - take 2

2016-02-09 Thread Michael Paquier
On Wed, Feb 10, 2016 at 2:57 AM, Fujii Masao  wrote:
> On Wed, Feb 10, 2016 at 1:36 AM, Masahiko Sawada  
> wrote:
>> Attached first version dedicated language patch (document patch is not yet.)
>
> Thanks for the patch! Will review it.
>
> I think that it's time to write the documentation patch.
>
> Though I've not read the patch yet, I found that your patch
> changed s_s_names so that it rejects non-alphabet characters
> like *, according to my simple test. It should accept any
> application_name that we can use.

Cool. Planning to look at it as well. Could you also write a
regression test based on the recovery infrastructure and submit it as
a separate patch? There is a version upthread of such a test but it
would be good to extract it properly.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Existence check for suitable index in advance when concurrently refreshing.

2016-02-09 Thread Michael Paquier
On Wed, Feb 10, 2016 at 2:23 AM, Fujii Masao  wrote:
> On Wed, Feb 10, 2016 at 2:21 AM, Fujii Masao  wrote:
>> On Tue, Feb 9, 2016 at 9:11 PM, Michael Paquier
>>  wrote:
>>> On Tue, Feb 9, 2016 at 4:27 PM, Fujii Masao  wrote:
>>>> Thanks for updating the patch!
>>>> Attached is the updated version of the patch.
>>>> I removed unnecessary assertion check and change of source code
>>>> that you added, and improved the source comment.
>>>> Barring objection, I'll commit this patch.
>>>
>>> So, this code basically duplicates what is already in
>>> refresh_by_match_merge to check if there is a UNIQUE index defined. If
>>> we are sure that an error is detected earlier in the code as done in
>>> this patch, wouldn't it be better to replace the error message in
>>> refresh_by_match_merge() by an assertion?
>>
>> I'm OK with an assertion if we add source comment about why
>> refresh_by_match_merge() can always guarantee that there is
>> a unique index on the matview. Probably it's because the matview
>> is locked with exclusive lock at the start of ExecRefreshMatView(),
>> i.e., it's guaranteed that we cannot drop any indexes on the matview
>> after the first check is passed. Also it might be better to add
>> another comment about that the caller of refresh_by_match_merge()
>> must always check that there is a unique index on the matview before
>> calling that function, just in the case where it's called elsewhere
>> in the future.
>>
>> OTOH, I don't think it's not so bad idea to just emit an error, instead.
>
> Sorry, s/it's not/it's

Well, refresh_by_match_merge is called only by ExecRefreshMatView()
and it is not exposed externally in matview.h, so it is not like we
are going to break any extension-related code by having an assertion
instead of an error message, and that's less code duplication to
maintain if this extra error message is removed. But feel free to
disregard my comment if you think that's not worth it; I won't
insist on it either :)
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for N synchronous standby servers - take 2

2016-02-09 Thread Kyotaro HORIGUCHI
Hello,

At Tue, 9 Feb 2016 13:31:46 +0900, Michael Paquier  
wrote in 

Re: NextXID format change (was Re: [HACKERS] exposing pg_controldata and pg_config as functions)

2016-02-09 Thread Michael Paquier
On Wed, Feb 10, 2016 at 9:57 AM, Joe Conway  wrote:
> I'll commit the attached tomorrow if there are no other concerns voiced.

Just a nitpick regarding this block:
+   if (strchr(p, '/') != NULL)
+       p = strchr(p, '/');
+   /* delimiter changed from '/' to ':' in 9.6 */
+   else if (GET_MAJOR_VERSION(cluster->major_version) >= 906)
+       p = strchr(p, ':');
+   else
+       p = NULL;
Changing it as follows would save some instructions because there is
no need to call strchr an extra time:
if (GET_MAJOR_VERSION(cluster->major_version) >= 906)
    p = strchr(p, ':');
else
    p = strchr(p, '/');

> In the spirit of the dev meeting discussion, I am trying to use the
> commit message template discussed. Something like:
>
> -- email subject limit -
> Change delimiter used for display of NextXID
>
> NextXID has been rendered in the form of a pg_lsn even though it
> really is not. This can cause confusion, so change the format from
> %u/%u to %u:%u, per discussion on hackers.
>
> Complaint by me, patch by me and Bruce, reviewed by Michael Paquier
> and Alvaro. Applied to HEAD only.
>
> Reported-by: Joe Conway
> Author: Joe Conway, Bruce Momjian
> Reviewed-by: Michael Paquier, Alvaro Herrera
> Tested-by: Michael Paquier
> Backpatch-through: master
> That does look pretty redundant though. Thoughts?

Removing Tested-by would be fine here I guess, testing and review are
overlapping concepts for this patch.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tracing down buildfarm "postmaster does not shut down" failures

2016-02-09 Thread Tom Lane
Andrew Dunstan  writes:
> Incidentally, as I noted earlier, the ecpg tests don't honour 
> TEMP_CONFIG, and in axolotl's case this could well make a difference, as 
> it is set up like this:
> ...
> So it's not running with fsync off or using the ramdisk for 
> stats_temp_directory.

Oooohh ...

I had just been looking into what else the postmaster does after "database
system is shut down" comes out.  One of those activities is telling the
stats collector to shut down, and then waiting for it to do so.  So a
really slow write of the collected stats might possibly account for
delaying things here.

What I was doing was adding some more logging in the
post-shutdown-checkpoint area, and this is what I see on dromedary,
when shutting down just after a regression test run:

LOG:  database system is shut down at 2016-02-09 19:12:29.497 EST
LOG:  doing before_shmem_exit 0 at 2016-02-09 19:12:29.498 EST
LOG:  doing on_shmem_exit 2 at 2016-02-09 19:12:29.498 EST
LOG:  doing on_shmem_exit 1 at 2016-02-09 19:12:29.498 EST
LOG:  doing on_shmem_exit 0 at 2016-02-09 19:12:29.498 EST
LOG:  doing on_proc_exit 1 at 2016-02-09 19:12:29.498 EST
LOG:  doing on_proc_exit 0 at 2016-02-09 19:12:29.498 EST
LOG:  calling exit(0) at 2016-02-09 19:12:29.498 EST
LOG:  checkpointer dead at 2016-02-09 19:12:32.382 EST
LOG:  all children dead at 2016-02-09 19:12:32.387 EST
LOG:  doing on_shmem_exit 3 at 2016-02-09 19:12:32.387 EST
LOG:  doing on_shmem_exit 2 at 2016-02-09 19:12:32.387 EST
LOG:  doing on_shmem_exit 1 at 2016-02-09 19:12:32.387 EST
LOG:  doing on_shmem_exit 0 at 2016-02-09 19:12:32.396 EST
LOG:  doing on_proc_exit 1 at 2016-02-09 19:12:32.396 EST
LOG:  doing on_proc_exit 0 at 2016-02-09 19:12:32.396 EST
LOG:  lock files all released at 2016-02-09 19:12:32.397 EST
LOG:  calling exit(0) at 2016-02-09 19:12:32.397 EST

The first batch of those "on_shmem/proc_exit" reports are from the
checkpointer process, and the second set are from the postmaster.
The stats collector shutdown would be between "checkpointer dead"
and "all children dead".  We can see that on this machine, that's
not really an issue ... but how in the world did it take the postmaster
nigh three seconds to notice the checkpoint process exit?  Something
very odd indeed there.

Anyway, I think I should push this additional instrumentation so you
can use it on axolotl.

This also makes it look like we really need to revisit where/when the
"database system is shut down" message is printed.  But that's for
later.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tracing down buildfarm "postmaster does not shut down" failures

2016-02-09 Thread Tom Lane
I wrote:
> Anyway, I think I should push this additional instrumentation so you
> can use it on axolotl.

Done.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tracing down buildfarm "postmaster does not shut down" failures

2016-02-09 Thread Tom Lane
Andrew Dunstan  writes:
> So it's not running with fsync off or using the ramdisk for 
> stats_temp_directory. Of course, that doesn't explain why we're not 
> seeing it on branches earlier than 9.5, but it could explain why we're 
> only seeing it on the ecpg tests.

BTW, the evidence we already have from axolotl's last run is that
post-checkpoint shutdown in the ecpg test was pretty quick:

LOG:  database system is shut down at 2016-02-09 16:31:14.784 EST
LOG:  lock files all released at 2016-02-09 16:31:14.817 EST

However, I'd already noted from some other digging in the buildfarm
logs that axolotl's speed seems to vary tremendously.  I do not
know what else you typically run on that hardware, but putting it
under full load might help prove things.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tracing down buildfarm "postmaster does not shut down" failures

2016-02-09 Thread Andrew Dunstan



On 02/09/2016 08:49 PM, Tom Lane wrote:

Andrew Dunstan  writes:

On 02/09/2016 07:49 PM, Tom Lane wrote:

However, I'd already noted from some other digging in the buildfarm
logs that axolotl's speed seems to vary tremendously.  I do not
know what else you typically run on that hardware, but putting it
under full load might help prove things.

Almost nothing. It's a Raspberry Pi 2B that I got mainly to run a
buildfarm animal. About the only other thing of note is a very lightly
configured Nagios instance.

Huh, that's quite strange.  There is one fairly recent report of
axolotl failing "make check" because of taking over a minute to shut down:

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=axolotl&dt=2015-12-14%2020%3A30%3A52

but the runs before and after that show shutdown times of only a second or
two.



No idea why.

anyway, we got a failure pretty quickly:

   pg_ctl: server does not shut down at 2016-02-09 21:10:11.914 EST

   pg_regress: could not stop postmaster: exit code was 256
   Makefile:81: recipe for target 'check' failed
   make: *** [check] Error 2

The log traces from the shutdown are below.

cheers

andrew



LOG:  received fast shutdown request at 2016-02-09 21:09:11.824 EST
LOG:  aborting any active transactions
LOG:  autovacuum launcher shutting down at 2016-02-09 21:09:11.830 EST
LOG:  shutting down at 2016-02-09 21:09:11.839 EST
LOG:  checkpoint starting: shutdown immediate
LOG:  CheckPointGuts starting at 2016-02-09 21:09:11.840 EST
LOG:  CheckPointCLOG() done at 2016-02-09 21:09:11.840 EST
LOG:  CheckPointCommitTs() done at 2016-02-09 21:09:11.840 EST
LOG:  CheckPointSUBTRANS() done at 2016-02-09 21:09:11.841 EST
LOG:  CheckPointMultiXact() done at 2016-02-09 21:09:11.841 EST
LOG:  CheckPointPredicate() done at 2016-02-09 21:09:11.841 EST
LOG:  CheckPointRelationMap() done at 2016-02-09 21:09:11.841 EST
LOG:  CheckPointReplicationSlots() done at 2016-02-09 21:09:11.841 EST
LOG:  CheckPointSnapBuild() done at 2016-02-09 21:09:11.841 EST
LOG:  CheckPointLogicalRewriteHeap() done at 2016-02-09 21:09:11.842 EST
LOG:  BufferSync(5) beginning to write 172 buffers at 2016-02-09 21:09:11.845 EST
LOG:  BufferSync(5) done, wrote 172/172 buffers at 2016-02-09 21:09:14.635 EST

LOG:  CheckPointBuffers() done at 2016-02-09 21:09:14.636 EST
LOG:  CheckPointReplicationOrigin() done at 2016-02-09 21:09:14.636 EST
LOG:  CheckPointGuts done at 2016-02-09 21:09:14.636 EST
LOG:  checkpoint WAL record flushed at 2016-02-09 21:09:14.636 EST
LOG:  pg_control updated at 2016-02-09 21:09:14.637 EST
LOG:  smgrpostckpt() done at 2016-02-09 21:09:14.668 EST
LOG:  RemoveOldXlogFiles() done at 2016-02-09 21:09:14.668 EST
LOG:  TruncateSUBTRANS() done at 2016-02-09 21:09:14.669 EST
LOG:  checkpoint complete: wrote 172 buffers (1.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=2.794 s, sync=0.000 s, total=2.829 s; sync files=0, longest=0.000 s, average=0.000 s; distance=966 kB, estimate=966 kB

LOG:  shutdown checkpoint complete at 2016-02-09 21:09:14.669 EST
LOG:  ShutdownCLOG() complete at 2016-02-09 21:09:14.669 EST
LOG:  ShutdownCommitTs() complete at 2016-02-09 21:09:14.669 EST
LOG:  ShutdownSUBTRANS() complete at 2016-02-09 21:09:14.669 EST
LOG:  database system is shut down at 2016-02-09 21:09:14.669 EST
LOG:  doing before_shmem_exit 0 at 2016-02-09 21:09:14.670 EST
LOG:  doing on_shmem_exit 2 at 2016-02-09 21:09:14.670 EST
LOG:  doing on_shmem_exit 1 at 2016-02-09 21:09:14.670 EST
LOG:  doing on_shmem_exit 0 at 2016-02-09 21:09:14.670 EST
LOG:  doing on_proc_exit 1 at 2016-02-09 21:09:14.670 EST
LOG:  doing on_proc_exit 0 at 2016-02-09 21:09:14.671 EST
LOG:  calling exit(0) at 2016-02-09 21:09:14.671 EST
LOG:  checkpointer dead at 2016-02-09 21:09:14.683 EST
LOG:  all children dead at 2016-02-09 21:10:11.184 EST
LOG:  doing on_shmem_exit 3 at 2016-02-09 21:10:11.191 EST
LOG:  doing on_shmem_exit 2 at 2016-02-09 21:10:11.191 EST
LOG:  doing on_shmem_exit 1 at 2016-02-09 21:10:11.192 EST
LOG:  doing on_shmem_exit 0 at 2016-02-09 21:10:11.208 EST
LOG:  doing on_proc_exit 1 at 2016-02-09 21:10:11.209 EST
LOG:  doing on_proc_exit 0 at 2016-02-09 21:10:11.209 EST
LOG:  lock files all released at 2016-02-09 21:10:11.211 EST
LOG:  calling exit(0) at 2016-02-09 21:10:11.211 EST


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tracing down buildfarm "postmaster does not shut down" failures

2016-02-09 Thread Jim Nasby

On 2/8/16 2:45 PM, Tom Lane wrote:

Alvaro Herrera  writes:

Tom Lane wrote:

What I'd like to do to investigate this is put in a temporary HEAD-only
patch that makes ShutdownXLOG() and its subroutines much chattier about
how far they've gotten and what time it is, and also makes pg_ctl print
out the current time if it gives up waiting.



Seems like a reasonable place to start.  I assume you'll be installing
some debug-elog calls, enabled by default, and then once the problem is
fixed, we'll change the default to disabled but keep the actual calls?


I had in mind to just "git revert" the patch when we're done with it.
There might be some parts we want to keep (for instance, I'm thinking of
logging "postmaster is done" after we've unlinked the pidfile, which might
be useful permanently).  But I plan to err on the side of noisiness for
the moment, rather than make something I think has long-term value.


It's already difficult enough for DBAs to debug some performance issues, 
so getting rid of logging is a step backwards. I realize it's unlikely 
that you could run DEBUG2 in a prod environment, but we still shouldn't 
be reducing visibility.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-02-09 Thread Amit Kapila
On Tue, Feb 9, 2016 at 7:26 PM, Thom Brown  wrote:
>
> On 7 January 2016 at 05:24, Amit Kapila  wrote:
> > On Fri, Dec 25, 2015 at 6:36 AM, Robert Haas wrote:
> >>
> >> On Wed, Dec 23, 2015 at 1:16 AM, Amit Kapila 
> >> wrote:
> >> >> >
> >> >> > Here procArrayGroupXid sounds like Xid at group level, how about
> >> >> > procArrayGroupMemberXid?
> >> >> > Find the patch with renamed variables for PGProc
> >> >> > (rename_pgproc_variables_v1.patch) attached with mail.
> >> >>
> >> >> I sort of hate to make these member names any longer, but I wonder if
> >> >> we should make it procArrayGroupClearXid etc.
> >> >
> >> > If we go by this suggestion, then the names will look like:
> >> > PGPROC
> >> > {
> >> > ..
> >> > bool procArrayGroupClearXid;
> >> > pg_atomic_uint32 procArrayGroupNextClearXid;
> >> > TransactionId procArrayGroupLatestXid;
> >> > ..
> >> > }
> >> >
> >> > PROC_HDR
> >> > {
> >> > ..
> >> > pg_atomic_uint32 procArrayGroupFirstClearXid;
> >> > ..
> >> > }
> >> >
> >> > I think what I sent in the last patch was better.  It seems to me it is
> >> > better to add some comments before variable names, so that anybody
> >> > referring them can understand better and I have added comments in
> >> > attached patch rename_pgproc_variables_v2.patch to explain the same.
> >>
> >> Well, I don't know.  Anybody else have an opinion?
> >>
> >
> > It seems that either people don't have any opinion on this matter or they
> > are okay with either of the naming conventions being discussed.  I think
> > specifying Member after procArrayGroup can help distinguishing which
> > variables are specific to the whole group and which are specific to a
> > particular member.  I think that will be helpful for other places as well
> > if we use this technique to improve performance.  Let me know what
> > you think about the same.
> >
> > I have verified that previous patches can be applied cleanly and passes
> > make check-world.  To avoid confusion, I am attaching the latest
> > patches with this mail.
>
> Patches still apply 1 month later.
>

Thanks for verification!

>
> I don't really have an opinion on the variable naming.  I guess they
> only need making longer if there's going to be some confusion about
> what they're for,

Makes sense; that is the reason why I have added a few comments as well,
but I am not sure if you are suggesting something else.

> but I'm guessing it's not a blocker here.
>

I also think so, but I am not sure what else is required here.  The basic
idea of rename_pgproc_variables_v2.patch is to rename a few variables in
existing similar code, so that the main group_update_clog patch can adopt
that naming convention if required.  Other than that, I have handled all
review comments raised in this thread (mainly by Simon and Robert).
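
For context, the group-clear protocol that these fields implement looks
roughly like the following --- a simplified, untested sketch, with field
names along the lines discussed in this thread rather than the exact
ones from any particular patch version:

    PROC_HDR   *procglobal = ProcGlobal;
    uint32      nextidx;

    /* Mark ourselves as a member wanting a group XID clear. */
    proc->procArrayGroupMember = true;
    proc->procArrayGroupMemberXid = latestXid;

    /* Push ourselves onto the lock-free list of waiting members. */
    while (true)
    {
        nextidx = pg_atomic_read_u32(&procglobal->procArrayGroupFirst);
        pg_atomic_write_u32(&proc->procArrayGroupNext, nextidx);

        if (pg_atomic_compare_exchange_u32(&procglobal->procArrayGroupFirst,
                                           &nextidx,
                                           (uint32) proc->pgprocno))
            break;
    }

    /*
     * If the list was previously empty, we are the leader: acquire
     * ProcArrayLock once and clear the XIDs of every queued member.
     */

Seen that way, the Member suffix marks the per-process fields, while a
field like procArrayGroupFirst in PROC_HDR belongs to the group as a
whole.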

Is there anything I can do to move this forward?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Tracing down buildfarm "postmaster does not shut down" failures

2016-02-09 Thread Noah Misch
On Tue, Feb 09, 2016 at 10:02:17PM -0500, Tom Lane wrote:
> Still, it seems clear that the bulk of the shutdown time is indeed the
> stats collector taking its time about shutting down, which is doubly
> weird because the ecpg tests shouldn't have created very many tables,
> so why would there be a lot of data to write?  Even granting that it's
> not writing to ramdisk, 57 seconds to shut down seems pretty excessive.
> 
> I wonder if it's worth sticking some instrumentation into stats
> collector shutdown?

I wouldn't be surprised if the collector got backlogged during the main phase
of testing and took awhile to chew through its message queue before even
starting the write of the final stats.
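
If we do add instrumentation, even just bracketing the final stats write
would show whether the time goes into draining the queue or into the
write itself.  A minimal sketch in the collector's exit path (assuming
the current two-argument pgstat_write_statsfiles() signature):

    elog(LOG, "stats collector: queue drained, writing final stats files");
    pgstat_write_statsfiles(true, true);
    elog(LOG, "stats collector: final stats files written");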


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tracing down buildfarm "postmaster does not shut down" failures

2016-02-09 Thread Tom Lane
Jim Nasby  writes:
> On 2/8/16 2:45 PM, Tom Lane wrote:
>> I had in mind to just "git revert" the patch when we're done with it.

> It's already difficult enough for DBAs to debug some performance issues, 
> so getting rid of logging is a step backwards. I realize it's unlikely 
> that you could run DEBUG2 in a prod environment, but we still shouldn't 
> be reducing visibility.

Meh.  It seems clear to me that we should move the "database system is
shut down" message so that it comes out only after the postmaster has
removed its lockfiles (and hence is really gone so far as any outside
vantage point is concerned).  But I do not think any of the rest of
what I put in has any lasting value.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tracing down buildfarm "postmaster does not shut down" failures

2016-02-09 Thread Tom Lane
Noah Misch  writes:
> On Tue, Feb 09, 2016 at 10:02:17PM -0500, Tom Lane wrote:
>> I wonder if it's worth sticking some instrumentation into stats
>> collector shutdown?

> I wouldn't be surprised if the collector got backlogged during the main phase
> of testing and took awhile to chew through its message queue before even
> starting the write of the final stats.

But why would the ecpg tests show such an effect when the main regression
tests don't?  AFAIK the ecpg tests don't exactly stress the server ---
note the trivial amount of data written by the shutdown checkpoint,
for instance.

The other weird thing is that it's only sometimes slow.  If you look at
the last buildfarm result from axolotl, for instance, the tail end of
the ecpg log is

LOG:  ShutdownSUBTRANS() complete at 2016-02-09 16:31:14.784 EST
LOG:  database system is shut down at 2016-02-09 16:31:14.784 EST
LOG:  lock files all released at 2016-02-09 16:31:14.817 EST

so we only spent ~50ms on stats write that time.

The idea I was toying with is that previous filesystem activity (making
the temp install, the server's never-fsync'd writes, etc) has built up a
bunch of dirty kernel buffers, and at some point the kernel goes nuts
writing all that data.  So the issues we're seeing would come and go
depending on the timing of that I/O spike.  I'm not sure how to prove
such a theory from here.
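
If anyone wants to check that on the affected animals, one crude
Linux-only approach is to sample the kernel's dirty-page counters while
the tests run and shut down --- just an untested sketch:

    /*
     * Print the Dirty/Writeback counters from /proc/meminfo; run it in
     * a loop alongside "pg_ctl stop" to see whether an I/O backlog
     * coincides with the slow shutdowns.
     */
    #include <stdio.h>
    #include <string.h>

    int
    main(void)
    {
        FILE       *f = fopen("/proc/meminfo", "r");
        char        line[128];

        if (f == NULL)
            return 1;
        while (fgets(line, sizeof(line), f))
        {
            if (strncmp(line, "Dirty:", 6) == 0 ||
                strncmp(line, "Writeback:", 10) == 0)
                fputs(line, stdout);
        }
        fclose(f);
        return 0;
    }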

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tracing down buildfarm "postmaster does not shut down" failures

2016-02-09 Thread Andrew Dunstan



On 02/09/2016 10:27 PM, Tom Lane wrote:

> Noah Misch  writes:
>> On Tue, Feb 09, 2016 at 10:02:17PM -0500, Tom Lane wrote:
>>> I wonder if it's worth sticking some instrumentation into stats
>>> collector shutdown?
>
>> I wouldn't be surprised if the collector got backlogged during the main phase
>> of testing and took awhile to chew through its message queue before even
>> starting the write of the final stats.
>
> But why would the ecpg tests show such an effect when the main regression
> tests don't?  AFAIK the ecpg tests don't exactly stress the server ---
> note the trivial amount of data written by the shutdown checkpoint,
> for instance.

The main regression tests run with the stats file on the ramdisk.

> The other weird thing is that it's only sometimes slow.  If you look at
> the last buildfarm result from axolotl, for instance, the tail end of
> the ecpg log is
>
> LOG:  ShutdownSUBTRANS() complete at 2016-02-09 16:31:14.784 EST
> LOG:  database system is shut down at 2016-02-09 16:31:14.784 EST
> LOG:  lock files all released at 2016-02-09 16:31:14.817 EST
>
> so we only spent ~50ms on stats write that time.

That part is puzzling.

> The idea I was toying with is that previous filesystem activity (making
> the temp install, the server's never-fsync'd writes, etc) has built up a
> bunch of dirty kernel buffers, and at some point the kernel goes nuts
> writing all that data.  So the issues we're seeing would come and go
> depending on the timing of that I/O spike.  I'm not sure how to prove
> such a theory from here.

Yeah. It's faintly possible that a kernel upgrade will help.

cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Relation extension scalability

2016-02-09 Thread Dilip Kumar
On Fri, Feb 5, 2016 at 4:50 PM, Amit Kapila  wrote:

> > Could you also measure how this behaves for an INSERT instead of a COPY
> > workload?
>
> I think such a test will be useful.
>

I have measured the performance with INSERT to see the behavior when it
doesn't use a buffer access strategy. I have tested multiple options:
small tuples, big tuples, data that fits in shared buffers, and data
that doesn't fit in shared buffers.

Observation:
--
 Apart from this test, I have also used a tool which can take many stack
traces with some delay between them.
1. I have observed with the base code (when data doesn't fit in shared
buffers) that almost all the stack traces are waiting on
"LockRelationForExtension", and many on "FlushBuffer" as well (flushing
dirty buffers).

Total Stack Captured: 204, FlushBuffer: 13, LockRelationForExtension: 187

 (This test was run with 8 threads (shared_buffers = 512MB), capturing a
stack trace every 5 seconds.)

2. If I change shared_buffers to 48GB, then FlushBuffer obviously
disappears, but LockRelationForExtension still shows up in very high
numbers.

3. Performance of the base code remains very low and non-scaling in both
cases, whether the data fits in shared buffers or not (we can see that
in the results below).

Test--1 (big record insert and Data fits in shared Buffer)

setup

./psql -d postgres -c "create table test_data(a int, b text)"
./psql -d postgres -c "insert into test_data
values(generate_series(1,1000),repeat('x', 1024))"
./psql -d postgres -c "create table data (a int)
with(extend_by_blocks=$variable)" {create table data (a int) for base code}
echo "insert into data select * from test_data;" >> copy_script

test:
-
 shared_buffers=48GB  max_wal_size=20GB  checkpoint_timeout=10min
./pgbench -c $ -j $ -f copy_script -T 120 postgres



client    base    extend_by_blocks=50    extend_by_blocks=1000
1         113     115                    118
4         50      220                    216
8         43      202                    302


Test--2 (big record insert and Data doesn't fits in shared Buffer)
--
setup:
---
./psql -d postgres -c "create table test_data(a int, b text)"
./psql -d postgres -c "insert into test_data
values(generate_series(1,1000),repeat('x', 1024))"
./psql -d postgres -c "create table data (a int)
with(extend_by_blocks=1000)"
echo "insert into data select * from test_data;" >> copy_script

test:
--
 shared_buffers=512MB  max_wal_size=20GB  checkpoint_timeout=10min
./pgbench -c $ -j $ -f copy_script -T 120 postgres


client    base    extend_by_blocks=1000
1         125     125
4         49      236
8         41      294
16        39      279


Test--3 (small record insert and Data fits in shared Buffer)
--
setup:

./psql -d postgres -c "create table test_data(a int)"
./psql -d postgres -c "insert into test_data
values(generate_series(1,1))"
./psql -d postgres -c "create table data (a int) with(extend_by_blocks=20)"
 echo "insert into data select * from test_data;" >> copy_script

test:
-
 shared_buffers=48GB -c max_wal_size=20GB -c checkpoint_timeout=10min
./pgbench -c $ -j $ -f copy_script -T 120 postgres


client    base    patch (extend_by_blocks=20)
1         137     143
2         269     250
4         377     443
8         170     690
16        145     745

*All tests were done with data on MD and WAL on SSD.

Note: The last patch has a maximum limit of extend_by_blocks=100, so to
take performance numbers with extend_by_blocks=1000 I changed it locally.
I will send the modified patch once we finalize which approach to
proceed with.


> > I'm doubtful that anything that does the victim buffer search while
> > holding the extension lock will actually scale in a wide range of
> > scenarios.
> >
>
> I think the problem for victim buffer could be visible if the blocks
> are dirty and it needs to write the dirty buffer and especially as
> the patch is written where after acquiring the extension lock, it again
> tries to extend the relation without checking if it can get a page with
> space from FSM.  It seems to me that we should re-check the
> availability of page because while one backend is waiting on extension
> lock, other backend might have added pages.  To re-check the
> availability we might want to use something similar to
> LWLockAcquireOrWait() semantics as used during WAL writing.
>

I will work on this in the next version...
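
Roughly what I have in mind is something like the below --- an untested
sketch against RelationGetBufferForTuple() in hio.c (len, saveFreeSpace,
and bistate are the existing locals there); details may change in the
actual patch:

    /*
     * After waiting for the relation extension lock, recheck the FSM
     * before extending --- another backend may have added usable pages
     * while we slept.
     */
    LockRelationForExtension(relation, ExclusiveLock);

    targetBlock = GetPageWithFreeSpace(relation, len + saveFreeSpace);
    if (targetBlock != InvalidBlockNumber)
    {
        /* Reuse the page somebody else just added. */
        UnlockRelationForExtension(relation, ExclusiveLock);
    }
    else
    {
        /* Nobody beat us to it; really extend the relation. */
        buffer = ReadBufferBI(relation, P_NEW, bistate);
        ...
    }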

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Bug in StartupSUBTRANS

2016-02-09 Thread Michael Paquier
On Wed, Feb 10, 2016 at 3:45 AM, Tom Lane  wrote:
> Simon Riggs  writes:
>> Your patch looks right to me, so I will commit, barring objections... with
>> backpatch. Likely to 9.0, AFAICS.
>
> 9.0 is out of support and should not be patched anymore.
>
> I agree that the patch is basically correct, though I'd personally
> write it without bothering with the extra variable:
>
> +   /* must account for wraparound */
> +   if (startPage > TransactionIdToPage(0xFFFFFFFF))
> +   startPage = 0;
>
> Also, the comment at line 45 is now wrong and needs an addition.

Instead of using a hardcoded value, wouldn't it be better to use
something based on MaxTransactionId?
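
Say, something like this (untested, but since MaxTransactionId is
defined as 0xFFFFFFFF in access/transam.h the behavior should be
identical):

    /* must account for wraparound */
    if (startPage > TransactionIdToPage(MaxTransactionId))
        startPage = 0;
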
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] process type escape for log_line_prefix

2016-02-09 Thread Michael Paquier
On Mon, Feb 8, 2016 at 11:32 PM, Andres Freund  wrote:
> Frequently when reading postgres logs to do some post mortem analysis
> I'm left wondering what process emitted an error/log message. After the
> fact it's often hard to know wether an error message was emitted by a
> user backend or by something internal, say the checkpointer or
> autovacuum.  Logging all process starts is often impractical given the
> log volume that causes.
>
> So I'm proposing adding an escape displaying the process title (say 'k'
> for kind?). So %k would emit something like "autovacuum worker process",
> "wal sender process" etc.

It would be nice to get something consistent between the ps output and
this new prefix, say with for example a miscadmin.h parameter like
MyProcName.

> I'm thinking it might make sense to give normal connections "" as the
> name, they're usually already discernible.

Yeah, that sounds fine to me. What about background workers? I would
think that they should use BackgroundWorker->bgw_name.
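
On the formatting side, I would expect the new escape to reduce to
something like this in elog.c's log_line_prefix handling (a rough
sketch; MyProcName being the hypothetical miscadmin.h variable
mentioned above):

    case 'k':
        if (MyProcName != NULL && MyProcName[0] != '\0')
            appendStringInfoString(buf, MyProcName);
        break;
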
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


  1   2   >