Re: [HACKERS] Support for N synchronous standby servers - take 2

2015-07-07 Thread Michael Paquier
On Tue, Jul 7, 2015 at 2:19 PM, Josh Berkus j...@agliodbs.com wrote:
 On 07/06/2015 09:56 PM, Michael Paquier wrote:
 On Tue, Jul 7, 2015 at 12:51 PM, Josh Berkus j...@agliodbs.com wrote:
 On 07/06/2015 06:40 PM, Michael Paquier wrote:
 On Tue, Jul 7, 2015 at 2:56 AM, Josh Berkus j...@agliodbs.com wrote:
 pro-JSON:

 * standard syntax which is recognizable to sysadmins and devops.
 * can use JSON/JSONB functions with ALTER SYSTEM SET to easily make
 additions/deletions from the synch rep config.
 * can add group labels (see below)

 If we go this way, I think that managing a JSON blob with a GUC
 parameter is crazy: because of the repeated key names, it is way
 longer in character size than a simple formula. Hence, this JSON blob
 should live in a separate place from postgresql.conf (not within the
 catalog tables), be manageable using an SQL interface, and be
 reloaded in backends using SIGHUP.

 I'm not following this at all.  What are you saying here?

 A JSON string is longer in terms of number of characters than a
 formula because it contains key names, and those key names are usually
 repeated several times, making it harder to read in a configuration
 file. So what I am saying is that we should not save it as a GUC, but
 as separate metadata that can be accessed and manipulated with a set
 of SQL functions.

 Where, though?  Someone already pointed out the issues with storing it
 in a system catalog, and adding an additional .conf file with a
 different format is too horrible to contemplate.

Something like a pg_syncinfo/ directory coupled with an LWLock; we
already do something similar for replication slots with pg_replslot/.
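
For illustration, a minimal sketch of what writing such a blob might
look like, assuming a hypothetical SyncRepConfigLock LWLock and file
layout (none of this is in any patch; it only illustrates the idea):

    static void
    write_syncinfo(const char *json_blob)
    {
        FILE       *f;

        /* serialize writers, as pg_replslot/ updates do with their locks */
        LWLockAcquire(SyncRepConfigLock, LW_EXCLUSIVE);
        f = fopen("pg_syncinfo/config.json", "w");
        if (f == NULL)
            elog(ERROR, "could not open pg_syncinfo/config.json: %m");
        fputs(json_blob, f);
        fclose(f);
        LWLockRelease(SyncRepConfigLock);
    }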
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multivariate statistics / patch v7

2015-07-07 Thread Kyotaro HORIGUCHI
Hi, Tomas. I'll hit the gas pedal.

  Thank you, it looks clearer. I have some comments from a brief look
  at this. This patchset is relatively large, so I will comment on a
  per-notice basis.. which means I'll send comments before examining
  the entirety of this patchset. Sorry in advance for the desultory
  comments.
 
 Sure. If you run into something that's not clear enough, I'm happy to
 explain that (I tried to cover all the important details in the
 comments, but it's a large patch, indeed.)


  - Single-variate stats have a mechanism to inject arbitrary
 values as statistics, that is, get_relation_stats_hook and
 similar hooks. I want a similar mechanism for multivariate
 statistics, too.
 
 Fair point, although I'm not sure where we should place the hook, how
 exactly it should be defined, and how useful it would be in the
 end. Can you give an example of how you'd use such a hook?

It's my secret, but it is open :p. This is crucial for us when
examining, in vitro, many planner-related problems that occurred at
our customers' sites.

http://pgdbmsstats.osdn.jp/pg_dbms_stats-en.html

# Mmm, this doc is a bit too old..

One of our tools works like the following:

- Copy pg_statistic and some attributes of pg_class into some
  table. Of course this is exportable.

- For example, in examine_simple_variable, using the hook
  get_relation_stats_hook, inject the saved statistics in place
  of the real statistics.

The hook point is placed where the parameters specifying which
statistics are needed are available in compact shape, and all the
hook function has to do is return the corresponding statistics
values.
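
For concreteness, a minimal sketch of such a hook implementation along
the lines described above; lookup_saved_statistics and
my_free_saved_stats are hypothetical helpers, only the hook signature
comes from selfuncs.h:

    static bool
    my_relation_stats_hook(PlannerInfo *root, RangeTblEntry *rte,
                           AttrNumber attnum, VariableStatData *vardata)
    {
        HeapTuple   saved = lookup_saved_statistics(rte->relid, attnum);

        if (!HeapTupleIsValid(saved))
            return false;       /* fall back to the real pg_statistic */

        /* inject the exported statistics in place of the real ones */
        vardata->statsTuple = saved;
        vardata->freefunc = my_free_saved_stats;
        return true;
    }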

So the parallel mechanism for these mv stats would look like this:

MVStatisticInfo *
get_mv_statistics(PlannerInfo *root, Index relid);

or 

MVStatisticInfo *
get_mv_statistics(PlannerInfo *root, Index relid, Bitmapset *attnums /* or a List of attnos */);

So by simply applying this, the current clauselist_selectivity
code would turn into something like the following:

 if (list_length(clauses) == 1)
     return clause_selectivity(...);
 
 Index varrelid = find_singleton_relid(root, clauses, varRelid);
 
 if (varrelid)
 {
     /* Bitmapset *attnums = collect_attnums(root, clauses, varrelid); */
     if (get_mv_statistics_hook)
         stats = get_mv_statistics_hook(root, varrelid /*, attnums */);
     else
         stats = get_mv_statistics(root, varrelid /*, attnums */);
 
     ...
 }

In comparison to single-column statistics, it might be preferable to
keep the statistics values separate from their definitions.

 I've never used get_relation_stats_hook, but if I get it right, the
 plugins can use the hook to create the stats (for each column), either
 from scratch or by tweaking the existing stats.

Mostly existing stats without change. I have seen a few hackers who
wanted to provide predefined statistics for typical cases; I haven't
seen anyone who tweaks existing stats.

 I'm not sure how this should work with multivariate stats, though,
 because there can be an arbitrary number of stats for a column, and it
 really depends on all the clauses (so examine_variable() seems a bit
 inappropriate, as it only sees a single variable at a time).

Restriction clauses are not a problem. What is needed to replace
stats values is to define a few APIs to retrieve them, and to
retrieve the stats values only in ways compatible with those
APIs. As an extreme case, substitute views for mv stats would be
workable, but that is not good.

 Moreover, with multivariate stats
 
(a) there may be an arbitrary number of stats for a column
 
(b) only some of the stats end up being used for the estimation
 
 I see two or three possible places for calling such a hook:
 
(a) at the very beginning, after fetching the list of stats
 
- sees all the existing stats on a table
- may add entirely new stats or tweak the existing ones

Getting all stats for a table would be okay, but an attnum list can
narrow down the candidates, as in the second form of the example
APIs above. And we can set aside the case of forged or tweaked
stats; those are the hook author's problem, not ours.


(b) after collecting the list of variables compatible with
multivariate stats
 
- like (a) and additionally knows which columns are interesting
  for the query (but only with respect to the existing stats)

We should carefully design the API so that it can identify the
pertinent stats in every situation. Mv stats are based on the
correlation of multiple columns, so I think a relid and an
attribute list are enough as parameters.

| if (st.relid == param.relid && bms_equal(st.attnums, param.attnums))
|    /* These are the stats we want */

If we can filter the appropriate stats from all the stats using the
clause list, we can definitely build the appropriate parameter
(column set) before retrieving the mv statistics. Isn't that correct?

(c) after optimization (selection of the right combination of stats)
 
- like (b), but can't affect the optimization
 
 But I can't really imagine anyone building multivariate stats on the
 fly, in the hook.
 
 It's more 

Re: [HACKERS] Freeze avoidance of very large table.

2015-07-07 Thread Sawada Masahiko
On Wed, Jul 8, 2015 at 12:37 AM, Andres Freund and...@anarazel.de wrote:
 On 2015-07-07 16:25:13 +0100, Simon Riggs wrote:
 I don't think pg_freespacemap is the right place.

 I agree that pg_freespacemap sounds like an odd location.

 I'd prefer to add that as a single function into core, so we can write
 formal tests.

 With the advent of src/test/modules it's not really a prerequisite for
 things to be built in to be testable. I think there are fair arguments
 for moving stuff like pg_stattuple, pg_freespacemap, pg_buffercache
 into core at some point, but that's probably a separate discussion.


I understand.
So I will place a bunch of tests in something like src/test/modules/visibilitymap_test,
which will contain some tests regarding this feature,
and gather them into one patch.

Regards,

--
Sawada Masahiko


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Freeze avoidance of very large table.

2015-07-07 Thread Andres Freund
On 2015-07-07 16:25:13 +0100, Simon Riggs wrote:
 I don't think pg_freespacemap is the right place.

I agree that pg_freespacemap sounds like an odd location.

 I'd prefer to add that as a single function into core, so we can write
 formal tests.

With the advent of src/test/modules it's not really a prerequisite for
things to be built in to be testable. I think there are fair arguments
for moving stuff like pg_stattuple, pg_freespacemap, pg_buffercache into
core at some point, but that's probably a separate discussion.

Regards,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Information of pg_stat_ssl visible to all users

2015-07-07 Thread Andres Freund
On 2015-07-07 12:03:36 -0400, Peter Eisentraut wrote:
 I think the DN is analogous to the remote user name, which we don't
 expose for any of the other authentication methods.

Huh?

Datum
pg_stat_get_activity(PG_FUNCTION_ARGS)
{
	/* Values available to all callers */
	values[0] = ObjectIdGetDatum(beentry->st_databaseid);
	values[1] = Int32GetDatum(beentry->st_procpid);
	values[2] = ObjectIdGetDatum(beentry->st_userid);
...

Isn't that, like, essentially all of them? Sure, if you have an ident
mapping set up, then no, but I have a hard time seeing that as a
relevant use case.

 I think the default approach for security and authentication related
 information should be conservative, even if there is not a specific
 reason.  Or to put it another way: What is the motivation for showing
 this information at all?

That we already show equivalent information and that hiding it from
another place will just be crufty and make monitoring setups more
complex.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set of patch to address several Coverity issues

2015-07-07 Thread Andres Freund
On 2015-07-07 16:17:47 +0900, Michael Paquier wrote:
 2) Potential pointer dereference in plperl.c, fixed by 0002 (sent
 previously here =>
 CAB7nPqRBCWAXTLw0yBR=bk94cryxu8twvxgyyoxautw08ok...@mail.gmail.com).
 This is related to a change done by transforms. In short,
 plperl_call_perl_func@plperl.c will have a pointer dereference if
 desc->arg_arraytype[i] is InvalidOid. And AFAIK,
 fcinfo->flinfo->fn_oid can be InvalidOid in this code path.

Aren't we in trouble if fn_oid isn't valid at that point? Why would it
be ok for it to be InvalidOid? The function oid is used in lots of
fundamental places, like actually compiling the plperl functions
(compile_plperl_function).

Which path could lead to it validly being InvalidOid?

 3) visibilitymap_truncate and FreeSpaceMapTruncateRel are doing a
 NULL-pointer check on rel->rd_smgr, but it has been dereferenced in all
 the code paths leading to those checks. See 0003. For code readability
 mainly.

FWIW, there's actually some reasoning/paranoia behind those
checks. smgrtruncate() sends out immediate smgr sinval messages, which
will invalidate rd_smgr.  Now, I think in both cases there's currently
no way we'll run code between the smgrtruncate() and the if
(rel->rd_smgr) that does a CommandCounterIncrement() causing them to be
replayed locally, but there's some merit in future-proofing.

 4) The return value of timestamp2tm goes unchecked in two places,
 GetCurrentDateTime and GetCurrentTimeUsec, while all the other ~40
 calls of this function do sanity checks. Returning
 ERRCODE_DATETIME_VALUE_OUT_OF_RANGE in case of an error would be good
 for consistency. See 0004. (issue reported previously here
 CAB7nPqRSk=J8eUdd55fL-w+k=8sDTHLVBt-cgG9jWN=vo2o...@mail.gmail.com)

But the other cases accept either arbitrary input or perform timezone
conversions. Not the case here, no?

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Information of pg_stat_ssl visible to all users

2015-07-07 Thread Peter Eisentraut
On 7/2/15 3:29 PM, Magnus Hagander wrote:
 On Thu, Jul 2, 2015 at 5:40 PM, Peter Eisentraut pete...@gmx.net
 mailto:pete...@gmx.net wrote:
 
 On 6/10/15 2:17 AM, Magnus Hagander wrote:
  AIUI that one was just about the DN field, and not about the rest. If I
  understand you correctly, you are referring to the whole thing, not just
  one field?
 
 I think at least the DN field shouldn't be visible to unprivileged
 users.
 
 What's the argument for that? I mean, the DN field is the equivalent of
 the username, and we show the username in pg_stat_activity already. Are
 you envisioning a scenario where there is actually something secret in
 the DN?

I think the DN is analogous to the remote user name, which we don't
expose for any of the other authentication methods.

 Actually, I think the whole view shouldn't be accessible to unprivileged
 users, except maybe your own row.
 
 
 I could go for some of the others if we think there's reason, but I
 don't understand the dn part?
 
 I guess there's some consistency in actually blocking exactly everything...

I think the default approach for security and authentication related
information should be conservative, even if there is not a specific
reason.  Or to put it another way: What is the motivation for showing
this information at all?




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] replication slot restart_lsn initialization

2015-07-07 Thread Gurjeet Singh
On Tue, Jul 7, 2015 at 6:49 AM, Andres Freund and...@anarazel.de wrote:

 On 2015-07-07 06:41:55 -0700, Gurjeet Singh wrote:
  There seems to be a misplaced not operator  ! in that if statement, as
  well. That sucks :( The MacOS gcc binary is actually clang, and its
 output
  is too noisy [1], which is probably why I might have missed warning if
 any.

 Which version of clang is it? With newer ones you can individually
 disable nearly all of the warnings. I regularly use clang, and most of
 the time it compiles master without warnings.



 $ gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr
--with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn)
Target: x86_64-apple-darwin14.4.0
Thread model: posix

I see that -Wno-parentheses-equality might help, but I am not looking
forward to maintaining OS-specific flags for clang-that-pretends-to-be-gcc
in my shell scripts [2].

[2] https://github.com/gurjeet/pgd

Attached is a patch that introduces SlotIsPhysical/SlotIsLogical macros.
This patch, if accepted, goes on top of the v4 patch.
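
For reference, a minimal sketch of what such macros could look like,
assuming the existing convention that a physical slot carries no
database OID (the real definitions are in the attached patch, not
reproduced here):

    #define SlotIsPhysical(slot) ((slot)->data.database == InvalidOid)
    #define SlotIsLogical(slot)  ((slot)->data.database != InvalidOid)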


  pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
 {
  Name        name = PG_GETARG_NAME(0);
+ bool        activate = PG_GETARG_BOOL(1);
  
   Don't like 'activate' much as a name. How about 'immediately_reserve'?
  
 
  I still like 'activate', but okay. How about 'reserve_immediately'
 instead?

 Activate is just utterly wrong. A slot isn't inactive before. And
 'active' is already used for slots that are currently in use
 (i.e. connected to).

  Also, do you want this name change just in the C code, or in the pg_proc
  and docs as well?

 All.


Patch v4 attached.

On a side note, I see that the pg_create_*_replication_slot() functions do
not behave transactionally; that is, rolling back a transaction does not
undo the slot creation. Should we prevent these, and corresponding drop
functions, from being called inside an explicit transaction?
PreventTransactionChain() is geared towards serving just the utility
commands. Do we protect against calling such functions in a transaction
block, or from user functions? How?
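
For reference, a minimal sketch of how a SQL-callable function can
refuse to run inside an explicit transaction block (whether the slot
functions should do this at all is exactly the question above):

    if (IsTransactionBlock())
        ereport(ERROR,
                (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
                 errmsg("replication slot functions cannot run inside a transaction block")));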

Best regards,
-- 
Gurjeet Singh http://gurjeet.singh.im/


physical_repl_slot_activate_restart_lsn.v4.patch
Description: Binary data


slot_is_physical_or_logical_macros.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Information of pg_stat_ssl visible to all users

2015-07-07 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 On 2015-07-07 12:03:36 -0400, Peter Eisentraut wrote:
 I think the DN is analogous to the remote user name, which we don't
 expose for any of the other authentication methods.

 Huh?

Peter's exactly right: there is no other case where you can tell what
some other connection's actual OS username is.  You might *guess* that
it's the same as their database username, but you don't know that,
assuming you don't know how they authenticated.

I'm not sure how security-critical this info really is, though.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] correct the initdb.log path for pg_regress

2015-07-07 Thread Fujii Masao
On Wed, Jul 8, 2015 at 12:23 AM, David Christensen da...@endpoint.com wrote:
 When encountering an initdb failure in pg_regress, we were displaying the 
 incorrect path to the log file; this commit fixes all 3 places this could 
 occur.

Pushed. Thanks!

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FPW compression leaks information

2015-07-07 Thread Claudio Freire
On Tue, Jul 7, 2015 at 12:34 PM, Stephen Frost sfr...@snowman.net wrote:
 * Heikki Linnakangas (hlinn...@iki.fi) wrote:
 On 07/07/2015 04:31 PM, Stephen Frost wrote:
 The alternative is to have monitoring tools which are running as
 superuser, which, in my view at least, is far worse.

 Or don't enable fpw_compression for tables where the information
 leak is a problem.

 My hope would be that we would enable FPW compression by default for
 everyone as a nice optimization.  Relegating it to a risky option which
 has to be tweaked on a per-table basis, but only for those tables where
 you don't mind the risk, makes a nice optimization nearly unusable for
 many environments.

No, only tables that have RLS (or the equivalent, like in the case of
pg_authid), where the leak may be meaningful.

The attack requires control over an adjacent (same page) row, but not
over the row being attacked. That's RLS.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FPW compression leaks information

2015-07-07 Thread Fujii Masao
On Wed, Jul 8, 2015 at 12:34 AM, Stephen Frost sfr...@snowman.net wrote:
 * Heikki Linnakangas (hlinn...@iki.fi) wrote:
 On 07/07/2015 04:31 PM, Stephen Frost wrote:
 The alternative is to have monitoring tools which are running as
 superuser, which, in my view at least, is far worse.

 Or don't enable fpw_compression for tables where the information
 leak is a problem.

 My hope would be that we would enable FPW compression by default for
 everyone as a nice optimization.  Relegating it to a risky option which
 has to be tweaked on a per-table basis, but only for those tables where
 you don't mind the risk, makes a nice optimization nearly unusable for
 many environments.

 I'm not following.  If we don't want the information to be available to
 everyone then we need to limit it, and right now the only way to do that
 is to restrict it to superuser because we haven't got anything more
 granular.
 
 In other words, it seems like your above response about not wanting this
 to be available to anyone except superusers is counter to this last
 sentence where you seem to be saying we should continue to have the
 information available to everyone and simply document that there's a
 risk there as in the proposed patch.

 I don't think we can or should try to hide the current WAL location.
 At least not until we have a monitoring role separate from
 superuserness.

+1

 I'd rather we hide it now, to allow FPW compression to be enabled for
 everyone, except those few environments where it ends up making things
 worse, and then provide the monitoring role in 9.6 which addresses this
 and various other pieces that are currently superuser-only.  We could
 wait, but I think we'd end up discouraging people from using the option
 because of the caveat and we'd then spend years undoing that in the
 future.

So one (ugly) idea is to introduce a new setting value like
more_secure_but_might_break_your_monitoring_system (better name? ;))
in wal_compression. If it's specified, FPWs are literally compressed and
non-superusers are disallowed from seeing the WAL location. This isn't
helpful for those who need WAL compression but want to allow even
non-superusers to see the WAL location, though. Maybe we also need to
implement a per-table FPW compression option to alleviate this situation.

Or another crazy idea is to append random-length dummy data to each
compressed FPW, which would make it really hard for an attacker to
guess the information from the WAL location. Even with this option
enabled, you would still get the performance improvement from FPW
compression (provided, of course, the dummy data is not too big).
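
A minimal sketch of the padding idea, assuming a hypothetical
MAX_FPW_PAD cap; note that pg_lrand48() is not cryptographically
strong, which is the weakness discussed downthread:

    /* pad the compressed image with a random amount of zero filler */
    int         padding = pg_lrand48() % (MAX_FPW_PAD + 1);

    memset(compressed_image + compressed_len, 0, padding);
    compressed_len += padding;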

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 alpha: some small comments on BRIN and btree_gin

2015-07-07 Thread Josh Berkus
On 07/07/2015 06:28 AM, Marc Mamin wrote:
 Sure, but on the other hand, they are so small and quick to build
 that they seem to be a good alternative when other index types are too
 costly, even if these indexes can't deal well with all data ranges
 passed as query conditions.
 
 Hence it would be fine if the planner could reject these indexes in
 the bad cases.

Oh, sorry!  I didn't realize that the planner was using the BRIN index
even when it was useless; your email wasn't clear.

The problem here is that the cost calculation for BRIN indexes should
take correlation into account, heavily.  Can we do that?  Is correlation
even part of the index costing method now?  How accurate are our
correlation estimates?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FPW compression leaks information

2015-07-07 Thread Stephen Frost
* Fujii Masao (masao.fu...@gmail.com) wrote:
 On Wed, Jul 8, 2015 at 12:34 AM, Stephen Frost sfr...@snowman.net wrote:
  I'd rather we hide it now, to allow FPW compression to be enabled for
  everyone, except those few environments where it ends up making things
  worse, and then provide the monitoring role in 9.6 which addresses this
  and various other pieces that are currently superuser-only.  We could
  wait, but I think we'd end up discouraging people from using the option
  because of the caveat and we'd then spend years undoing that in the
  future.
 
  So one (ugly) idea is to introduce a new setting value like
  more_secure_but_might_break_your_monitoring_system (better name? ;))
  in wal_compression. If it's specified, FPWs are literally compressed and
  non-superusers are disallowed from seeing the WAL location. This isn't
  helpful for those who need WAL compression but want to allow even
  non-superusers to see the WAL location, though. Maybe we also need to
  implement a per-table FPW compression option to alleviate this situation.

I'm not thrilled with that idea, but at the same time, we could have a
generic "hide potentially sensitive information" option that applies to
all of these things, and then admins could set that on their monitoring
role to allow it to see the data.  That wouldn't get in the way of the
default roles idea either.

  Or another crazy idea is to append random-length dummy data to each
  compressed FPW, which would make it really hard for an attacker to
  guess the information from the WAL location. Even with this option
  enabled, you would still get the performance improvement from FPW
  compression (provided, of course, the dummy data is not too big).

I'm not sure I'd call that crazy, as it's done in other systems..  This
would also help with cases where an attacker can view the WAL length
through other means, so it has some independent advantages.

Curious to hear what others think about that approach though.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] replication slot restart_lsn initialization

2015-07-07 Thread Andres Freund
On 2015-07-07 09:42:54 -0700, Gurjeet Singh wrote:
 On a side note, I see that the pg_create_*_replication_slot() functions do
 not behave transactionally; that is, rolling back a transaction does not
 undo the slot creation.

It can't, because otherwise you couldn't run them on a standby.

 Should we prevent these, and corresponding drop functions, from being
 called inside an explicit transaction?  PreventTransactionChain() is
 geared towards serving just the utility commands. Do we protect
 against calling such functions in a transaction block, or from user
 functions? How?

We discussed that when slots were introduced. There seems little
benefit in doing so, and it'll make some legitimate use cases harder.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/pgSQL, RAISE and error context

2015-07-07 Thread Merlin Moncure
On Tue, Jul 7, 2015 at 9:04 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 It doesn't have to if the behavior is guarded with a GUC.  I just
 don't understand what all the fuss is about.  The default behavior of
 logging, as well established by other languages (for example Java) that
 manage the error stack for you, should be to:

 *) Give a stack trace when an uncaught exception is thrown
 *) Not give a stack trace in any other logging case unless asked for

 what is RAISE EXCEPTION - first or second case?

First: RAISE (unless caught) is no different than any other kind of error.

On Tue, Jul 7, 2015 at 9:42 AM, Heikki Linnakangas hlinn...@iki.fi wrote:
 On 07/07/2015 04:56 PM, Merlin Moncure wrote:
  It doesn't have to if the behavior is guarded with a GUC.  I just
  don't understand what all the fuss is about.  The default behavior of
  logging, as well established by other languages (for example Java) that
  manage the error stack for you, should be to:

  *) Give a stack trace when an uncaught exception is thrown
  *) Not give a stack trace in any other logging case unless asked for

 Java's exception handling is so different from PostgreSQL's errors that I
 don't think there's much to be learned from that. But I'll bite:

 First of all, Java's exceptions always contain a stack trace. It's up to you
 when you catch an exception to decide whether to print it or not. try { ...
 } catch (Exception e) { e.printStackTrace() } is fairly common, actually.
 There is nothing like a NOTICE in Java, i.e. an exception that's thrown but
 doesn't affect the control flow. The best I can think of is
 System.out.println(), which of course has no stack trace attached to it.

exactly.

 Perhaps you're arguing that NOTICE is more like printing to stderr, and
 should never contain any context information. I don't think that would
 be an improvement. It's very handy to have the context information
 available if you don't know where a NOTICE is coming from, even if in
 most cases you're not interested in it.

That's exactly what I'm arguing.  NOTICE (and WARNING) are for
printing out information to client-side logging; it's really the only
tool we have for that purpose and it fits that role perfectly.  Of
course, you may want to have NOTICE print context, especially when
debugging, but some control over that would be nice, and in most cases
it's really not necessary.  I really don't understand the objection to
offering control over that behavior, although I certainly understand
wanting to keep the default behavior as it currently is.

 This is really quite different from a programming language's exception
 handling. First, there's a server, which produces the errors, and a separate
 client, which displays them. You cannot catch an exception in the client.

 BTW, let me throw in one use case to consider. We've been talking about
 psql, and what to print, but imagine a more sophisticated client like
 pgAdmin. It's not limited to either printing the context or not. It could
 e.g. hide the context information of all messages when they occur, but if
 you double-click on it, it's expanded to show all the context, location and
 all. You can't do that if the server doesn't send the context information in
 the first place.

 I would be happy to show you the psql redirected output logs from my
 nightly server processes that spew into the megabytes because of
 logging various high level steps (did this, did that).

 Oh, I believe you. I understand what the problem is, we're only talking
 about how best to address it.

Yeah.  For posterity, a psql-based solution would work fine for me,
but a server-side solution has a lot of advantages (less protocol
chatter, more configurability, keeping libpq/psql light).

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Missing latex-longtable value

2015-07-07 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 On Tue, Jul  7, 2015 at 11:48:13AM -0400, Tom Lane wrote:
 It's a bug.  Back-patch as needed.

 Doesn't that cause translation string differences that are worse than
 the original bug, e.g.:
  psql_error("\\pset: allowed formats are unaligned, aligned, wrapped, 
 html, asciidoc, latex, latex-longtable, troff-ms\n");

No.  When we've discussed this sort of thing in the past, people have been
quite clear that they'd rather have accurate messages that come out in
English than inaccurate-though-localized messages.  Certainly we should
avoid gratuitous changes in back-branch localized strings, but this is a
factual error in the message.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] creating extension including dependencies

2015-07-07 Thread David E. Wheeler
On Jul 7, 2015, at 6:41 AM, Andres Freund and...@anarazel.de wrote:

 At the minimum I'd like to see that CREATE EXTENSION foo; would
 install extension 'bar' if foo depended on 'bar' and CASCADE is
 specified. Right now we always error out saying that the dependency on
 'bar' is not fulfilled - not particularly helpful.

+1

If `yum install foo` also installs bar, and `pgxn install foo` downloads, 
builds, and installs bar, it makes sense to me that `CREATE EXTENSION foo` 
would install bar if it was available, and complain if it wasn’t.

Best,

David



smime.p7s
Description: S/MIME cryptographic signature


Re: [HACKERS] Missing latex-longtable value

2015-07-07 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 I have discovered that psql \pset format does not display
 latex-longtable as a valid value, e.g.:

  test=> \pset format kjasdf
   \pset: allowed formats are unaligned, aligned, wrapped, html, asciidoc, 
 latex, troff-ms

 With the attached patch, the latex-longtable value is properly displayed:

  test=> \pset format kjasdf
   \pset: allowed formats are unaligned, aligned, wrapped, html, asciidoc, 
 latex, latex-longtable, troff-ms

 Should this be fixed in 9.6 only or 9.5 too?

It's a bug.  Back-patch as needed.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Information of pg_stat_ssl visible to all users

2015-07-07 Thread Magnus Hagander
On Tue, Jul 7, 2015 at 6:03 PM, Peter Eisentraut pete...@gmx.net wrote:

 On 7/2/15 3:29 PM, Magnus Hagander wrote:
  On Thu, Jul 2, 2015 at 5:40 PM, Peter Eisentraut pete...@gmx.net
  mailto:pete...@gmx.net wrote:
 
  On 6/10/15 2:17 AM, Magnus Hagander wrote:
   AIUI that one was just about the DN field, and not about the rest.
 If I
   understand you correctly, you are referring to the whole thing,
 not just
   one field?
 
  I think at least the DN field shouldn't be visible to unprivileged
  users.
 
  What's the argument for that? I mean, the DN field is the equivalent of
  the username, and we show the username in pg_stat_activity already. Are
  you envisioning a scenario where there is actually something secret in
  the DN?

 I think the DN is analogous to the remote user name, which we don't
 expose for any of the other authentication methods.

  Actually, I think the whole view shouldn't be accessible to
 unprivileged
  users, except maybe your own row.
 
 
  I could go for some of the others if we think there's reason, but I
  don't understand the dn part?
 
  I guess there's some consistency in actually blocking exactly
 everything...

 I think the default approach for security and authentication related
 information should be conservative, even if there is not a specific
 reason.  Or to put it another way: What is the motivation for showing
 this information at all?


To make it accessible to monitoring systems that don't run as superuser
(which should be most monitoring systems, but we have other cases making
that hard, as has already been mentioned upthread).

I'm having a hard time trying to figure out a consensus in this thread. I
think there are slightly more arguments for limiting the access, though.

The question then is, if we want to hide everything, do we care about
doing the "NULL dance", or should we just throw an error for
non-superusers trying to access it?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] FPW compression leaks information

2015-07-07 Thread Claudio Freire
On Tue, Jul 7, 2015 at 2:29 PM, Stephen Frost sfr...@snowman.net wrote:
  Or another crazy idea is to append random-length dummy data to each
  compressed FPW, which would make it really hard for an attacker to
  guess the information from the WAL location. Even with this option
  enabled, you would still get the performance improvement from FPW
  compression (provided, of course, the dummy data is not too big).

 I'm not sure I'd call that crazy, as it's done in other systems..  This
 would also help with cases where an attacker can view the WAL length
 through other means, so it has some independent advantages.

 Curious to hear what others think about that approach though.

It's difficult to know whether the randomization would be effective.

For instance, if one were to use a simple linear congruential generator,
it's possible that the known relationship between the added lengths
allows their effect to be accounted for. The parameters of such an RNG
can be leaked by attacking a different table fully under the control
of the attacker, generating WAL with known compression ratios, and
comparing the resulting WAL sizes. IIRC, most non-crypto RNGs can be
similarly attacked.

So it would have to be a cryptographically secure RNG to be safe, and
that would be very costly to run during FPW.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Missing latex-longtable value

2015-07-07 Thread Bruce Momjian
On Tue, Jul  7, 2015 at 11:48:13AM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  I have discovered that psql \pset format does not display
  latex-longtable as a valid value, e.g.:
 
  test=> \pset format kjasdf
  \pset: allowed formats are unaligned, aligned, wrapped, html, asciidoc, 
  latex, troff-ms
 
  With the attached patch, the latex-longtable value is properly displayed:
 
  test=> \pset format kjasdf
  \pset: allowed formats are unaligned, aligned, wrapped, html, asciidoc, 
  latex, latex-longtable, troff-ms
 
  Should this be fixed in 9.6 only or 9.5 too?
 
 It's a bug.  Back-patch as needed.

Doesn't that cause translation string differences that are worse than
the original bug, e.g.:

 psql_error("\\pset: allowed formats are unaligned, aligned, wrapped, html, 
asciidoc, latex, latex-longtable, troff-ms\n");

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Repeated pg_upgrade buildfarm failures on binturon

2015-07-07 Thread Tom Lane
Oskari Saarenmaa o...@ohmu.fi writes:
 I've restricted builds to one at a time on that host to work around this
 issue for now.  Also attached a patch to explicitly set PWD=$(CURDIR) in
 the Makefile to make sure test.sh runs with the right directory.

I've pushed a patch for this issue.  Please revert your buildfarm
configuration so that we can verify it works now.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FPW compression leaks information

2015-07-07 Thread Stephen Frost
* Claudio Freire (klaussfre...@gmail.com) wrote:
 On Tue, Jul 7, 2015 at 12:34 PM, Stephen Frost sfr...@snowman.net wrote:
  * Heikki Linnakangas (hlinn...@iki.fi) wrote:
  On 07/07/2015 04:31 PM, Stephen Frost wrote:
  The alternative is to have monitoring tools which are running as
  superuser, which, in my view at least, is far worse.
 
  Or don't enable fpw_compression for tables where the information
  leak is a problem.
 
  My hope would be that we would enable FPW compression by default for
  everyone as a nice optimization.  Relegating it to a risky option which
  has to be tweaked on a per-table basis, but only for those tables where
  you don't mind the risk, makes a nice optimization nearly unusable for
  many environments.
 
 No, only tables that have RLS (or the equivalent, like in the case of
 pg_authid), where the leak may be meaningful.
 
 The attack requires control over an adjacent (same page) row, but not
 over the row being attacked. That's RLS.

Eh?  I don't recall Heikki's attack requiring RLS, and what about
column-level privilege cases where you have access to the row but not to
one of the columns?

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] FPW compression leaks information

2015-07-07 Thread Heikki Linnakangas

On 07/07/2015 07:31 PM, Fujii Masao wrote:

Or another crazy idea is to append random-length dummy data to each
compressed FPW, which would make it really hard for an attacker to
guess the information from the WAL location.


It makes the signal more noisy, but you can still mount the same attack 
if you just average it over more iterations. You could perhaps fuzz it 
enough to make the attack impractical, but it doesn't exactly give me a 
warm fuzzy feeling.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Set of patch to address several Coverity issues

2015-07-07 Thread Michael Paquier
Hi all,

As there have been complaints that it was hard to follow all the small
patches I have sent to fix the issues related to Coverity, here they
are gathered with patches for each one of them:
1) Missing return value checks in jsonfuncs.c, fixed by 0001 (sent
here previously =>
cab7npqqcj3hu9p7a6vuhomepjkoyqrjxnt1g2f7qy_cq0q8...@mail.gmail.com).
 JsonbIteratorNext is missing a set of (void) casts in front of its calls.
2) Potential pointer dereference in plperl.c, fixed by 0002 (sent
previously here =>
CAB7nPqRBCWAXTLw0yBR=bk94cryxu8twvxgyyoxautw08ok...@mail.gmail.com).
This is related to a change done by transforms. In short,
plperl_call_perl_func@plperl.c will have a pointer dereference if
desc->arg_arraytype[i] is InvalidOid. And AFAIK,
fcinfo->flinfo->fn_oid can be InvalidOid in this code path.
3) visibilitymap_truncate and FreeSpaceMapTruncateRel are doing a
NULL-pointer check on rel->rd_smgr, but it has been dereferenced in all
the code paths leading to those checks. See 0003. For code readability
mainly.
4) The return value of timestamp2tm goes unchecked in two places,
GetCurrentDateTime and GetCurrentTimeUsec, while all the other ~40
calls of this function do sanity checks. Returning
ERRCODE_DATETIME_VALUE_OUT_OF_RANGE in case of an error would be good
for consistency. See 0004. (issue reported previously here
CAB7nPqRSk=J8eUdd55fL-w+k=8sDTHLVBt-cgG9jWN=vo2o...@mail.gmail.com)

Each issue is independent, please feel free to comment.
Regards,
-- 
Michael
From e3f9729de90e37d173de7fce076dc0f0d4362a20 Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Tue, 7 Jul 2015 16:01:35 +0900
Subject: [PATCH 1/4] Fix missing return value checks for JsonbIteratorNext

Spotted by Coverity.
---
 src/backend/utils/adt/jsonfuncs.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
index 13d5b7a..b4258d8 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -3362,8 +3362,7 @@ jsonb_delete(PG_FUNCTION_ARGS)
 		{
 			/* skip corresponding value as well */
 			if (r == WJB_KEY)
-JsonbIteratorNext(&it, &v, true);
-
+(void) JsonbIteratorNext(&it, &v, true);
 			continue;
 		}
 
@@ -3436,7 +3435,7 @@ jsonb_delete_idx(PG_FUNCTION_ARGS)
 			if (i++ == idx)
 			{
 if (r == WJB_KEY)
-	JsonbIteratorNext(&it, &v, true);	/* skip value */
+	(void) JsonbIteratorNext(&it, &v, true);	/* skip value */
 continue;
 			}
 		}
-- 
2.4.5

From 88a8baa7702c54512af24e72749a12f905bfadff Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Tue, 7 Jul 2015 16:03:56 +0900
Subject: [PATCH 2/4] Fix pointer dereference in plperl caused by transforms

Spotted by Coverity.
---
 src/pl/plperl/plperl.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/pl/plperl/plperl.c b/src/pl/plperl/plperl.c
index 78baaac..d78cff1 100644
--- a/src/pl/plperl/plperl.c
+++ b/src/pl/plperl/plperl.c
@@ -2100,8 +2100,11 @@ plperl_call_perl_func(plperl_proc_desc *desc, FunctionCallInfo fcinfo)
 	PUSHMARK(SP);
 	EXTEND(sp, desc->nargs);
 
-	if (fcinfo->flinfo->fn_oid)
+	if (OidIsValid(fcinfo->flinfo->fn_oid))
+	{
 		get_func_signature(fcinfo->flinfo->fn_oid, &argtypes, &nargs);
+		Assert(nargs == desc->nargs);
+	}
 
 	for (i = 0; i < desc->nargs; i++)
 	{
@@ -2120,7 +2123,8 @@ plperl_call_perl_func(plperl_proc_desc *desc, FunctionCallInfo fcinfo)
 
 			if (OidIsValid(desc->arg_arraytype[i]))
 				sv = plperl_ref_from_pg_array(fcinfo->arg[i], desc->arg_arraytype[i]);
-			else if ((funcid = get_transform_fromsql(argtypes[i], current_call_data->prodesc->lang_oid, current_call_data->prodesc->trftypes)))
+			else if (OidIsValid(fcinfo->flinfo->fn_oid) &&
+				(funcid = get_transform_fromsql(argtypes[i], current_call_data->prodesc->lang_oid, current_call_data->prodesc->trftypes)))
 				sv = (SV *) DatumGetPointer(OidFunctionCall1(funcid, fcinfo->arg[i]));
 			else
 			{
-- 
2.4.5

From b99a83e052ce10f97af0a574f077167fcc992e6c Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Tue, 7 Jul 2015 16:08:15 +0900
Subject: [PATCH 3/4] Remove unnecessary NULL-pointer checks

All the code paths leading to those checks dereference those pointers.
Spotted by Coverity.
---
 src/backend/access/heap/visibilitymap.c   | 3 +--
 src/backend/storage/freespace/freespace.c | 3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
index 7c38772..acd9fde 100644
--- a/src/backend/access/heap/visibilitymap.c
+++ b/src/backend/access/heap/visibilitymap.c
@@ -510,8 +510,7 @@ visibilitymap_truncate(Relation rel, BlockNumber nheapblocks)
 	 * invalidate their copy of smgr_vm_nblocks, and this one too at the next
 	 * command boundary.  But this ensures it isn't outright wrong until then.
 	 */
-	if (rel->rd_smgr)
-		rel->rd_smgr->smgr_vm_nblocks = newnblocks;
+	rel->rd_smgr->smgr_vm_nblocks = newnblocks;
 }
 
 /*
diff --git 

Re: [HACKERS] [PATCH] Add missing \ddp psql command

2015-07-07 Thread Fujii Masao
On Tue, Jul 7, 2015 at 2:11 AM, David Christensen da...@endpoint.com wrote:
 Quickie patch for spotted missing psql \ddp tab-completion.

Thanks for the report and patch!

I found that tab-completion was missing not only for \ddp
but also for other psql meta-commands like \dE, \dm, \dO, \dy, \s and
\setenv. The attached patch adds all those missing tab-completions.

Thoughts?

Regards,

-- 
Fujii Masao
*** a/src/bin/psql/tab-complete.c
--- b/src/bin/psql/tab-complete.c
*************** psql_completion(const char *text, int start, int end)
*** 897,910 ****
  
  	static const char *const backslash_commands[] = {
  		"\\a", "\\connect", "\\conninfo", "\\C", "\\cd", "\\copy", "\\copyright",
! 		"\\d", "\\da", "\\db", "\\dc", "\\dC", "\\dd", "\\dD", "\\des", "\\det", "\\deu", "\\dew", "\\df",
  		"\\dF", "\\dFd", "\\dFp", "\\dFt", "\\dg", "\\di", "\\dl", "\\dL",
! 		"\\dn", "\\do", "\\dp", "\\drds", "\\ds", "\\dS", "\\dt", "\\dT", "\\dv", "\\du", "\\dx",
  		"\\e", "\\echo", "\\ef", "\\encoding", "\\ev",
  		"\\f", "\\g", "\\gset", "\\h", "\\help", "\\H", "\\i", "\\ir", "\\l",
  		"\\lo_import", "\\lo_export", "\\lo_list", "\\lo_unlink",
  		"\\o", "\\p", "\\password", "\\prompt", "\\pset", "\\q", "\\qecho", "\\r",
! 		"\\set", "\\sf", "\\sv", "\\t", "\\T",
  		"\\timing", "\\unset", "\\x", "\\w", "\\watch", "\\z", "\\!", NULL
  	};
  
--- 897,912 ----
  
  	static const char *const backslash_commands[] = {
  		"\\a", "\\connect", "\\conninfo", "\\C", "\\cd", "\\copy", "\\copyright",
! 		"\\d", "\\da", "\\db", "\\dc", "\\dC", "\\dd", "\\ddp", "\\dD",
! 		"\\des", "\\det", "\\deu", "\\dew", "\\dE", "\\df",
  		"\\dF", "\\dFd", "\\dFp", "\\dFt", "\\dg", "\\di", "\\dl", "\\dL",
! 		"\\dm", "\\dn", "\\do", "\\dO", "\\dp", "\\drds", "\\ds", "\\dS",
! 		"\\dt", "\\dT", "\\dv", "\\du", "\\dx", "\\dy",
  		"\\e", "\\echo", "\\ef", "\\encoding", "\\ev",
  		"\\f", "\\g", "\\gset", "\\h", "\\help", "\\H", "\\i", "\\ir", "\\l",
  		"\\lo_import", "\\lo_export", "\\lo_list", "\\lo_unlink",
  		"\\o", "\\p", "\\password", "\\prompt", "\\pset", "\\q", "\\qecho", "\\r",
! 		"\\s", "\\set", "\\setenv", "\\sf", "\\sv", "\\t", "\\T",
  		"\\timing", "\\unset", "\\x", "\\w", "\\watch", "\\z", "\\!", NULL
  	};
  
*************** psql_completion(const char *text, int start, int end)
*** 3791,3796 ****
--- 3793,3802 ----
  		COMPLETE_WITH_QUERY(Query_for_list_of_extensions);
  	else if (strncmp(prev_wd, "\\dm", strlen("\\dm")) == 0)
  		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_matviews, NULL);
+ 	else if (strncmp(prev_wd, "\\dE", strlen("\\dE")) == 0)
+ 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_foreign_tables, NULL);
+ 	else if (strncmp(prev_wd, "\\dy", strlen("\\dy")) == 0)
+ 		COMPLETE_WITH_QUERY(Query_for_list_of_event_triggers);
  
  	/* must be at end of \d list */
  	else if (strncmp(prev_wd, "\\d", strlen("\\d")) == 0)
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] two-arg current_setting() with fallback

2015-07-07 Thread Jeevan Chalke
On Fri, Jul 3, 2015 at 2:12 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Jeevan Chalke jeevan.cha...@enterprisedb.com writes:
  Attached patch which fixes my review comments.

 Applied with minor adjustments (mostly cosmetic, but did neither of you
 notice the compiler warning?)


Oops. Sorry for that.
I have added -Wall -Werror to my configuration.

Thanks


 regards, tom lane




-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Dependency between bgw_notify_pid and bgw_flags

2015-07-07 Thread Ashutosh Bapat
In CleanupBackgroundWorker(), we seem to differentiate between a background
worker with shared memory access and a backend.

2914         /*
2915          * Additionally, for shared-memory-connected workers, just like a
2916          * backend, any exit status other than 0 or 1 is considered a crash
2917          * and causes a system-wide restart.
2918          */

There is an assertion on line 2943 which implies that a backend should have
a database connection.

2939
2940         /* Get it out of the BackendList and clear out remaining data */
2941         if (rw->rw_backend)
2942         {
2943             Assert(rw->rw_worker.bgw_flags & BGWORKER_BACKEND_DATABASE_CONNECTION);

A backend is a process created to handle a client connection [1], which is
always connected to a database. After custom background workers were
introduced we seem to have continued that notion. Hence only the background
workers which request a database connection are in BackendList. So maybe we
should continue with that notion and correct the comment to read "Background
workers that request a database connection during registration are in this
list, too.", or delete that comment altogether.

With that notion of backend, to fix the original problem I reported,
PostmasterMarkPIDForWorkerNotify() should also look at the
BackgroundWorkerList. As per the comments in the prologue of this function,
it assumes that only backends need to be notified about worker state
changes, which looks too restrictive. Any backend or background worker
that wants to be notified about a background worker it requested to be
spawned should be allowed to do so.

5655 /*
5656  * When a backend asks to be notified about worker state changes, we
5657  * set a flag in its backend entry.  The background worker machinery needs
5658  * to know when such backends exit.
5659  */
5660 bool
5661 PostmasterMarkPIDForWorkerNotify(int pid)

PFA the patch, which changes PostmasterMarkPIDForWorkerNotify to look at
the BackgroundWorkerList as well.

The backends that request to be notified are marked using bgworker_notify,
so that when such a backend dies the notifications to it can be turned off
using BackgroundWorkerStopNotifications(). Now that we allow a background
worker to be notified, we have to build a similar flag into
RegisteredBgWorker for the same purpose. The patch does that.
[1]. http://www.postgresql.org/developer/backend/

On Mon, Jun 8, 2015 at 11:22 PM, Robert Haas robertmh...@gmail.com wrote:

 On Mon, Jun 8, 2015 at 1:51 PM, Alvaro Herrera alvhe...@2ndquadrant.com
 wrote:
  Robert Haas wrote:
 
  After studying this, I think it's a bug.  See this comment:
 
   * Normal child backends can only be launched when we are in PM_RUN or
   * PM_HOT_STANDBY state.  (We also allow launch of normal
   * child backends in PM_WAIT_BACKUP state, but only for superusers.)
   * In other states we handle connection requests by launching dead_end
   * child processes, which will simply send the client an error message
 and
   * quit.  (We track these in the BackendList so that we can know when
 they
   * are all gone; this is important because they're still connected to
 shared
   * memory, and would interfere with an attempt to destroy the shmem
 segment,
   * possibly leading to SHMALL failure when we try to make a new one.)
   * In PM_WAIT_DEAD_END state we are waiting for all the dead_end
 children
   * to drain out of the system, and therefore stop accepting connection
   * requests at all until the last existing child has quit (which
 hopefully
   * will not be very long).
 
  That comment seems to imply that, at the very least, all backends with
  shared-memory access need to be part of BackendList.  But really, I'm
  unclear why we'd ever want to exit without all background workers
  having shut down, so maybe they should all be in there.  Isn't that
  sorta the point of this facility?
 
  Alvaro, any thoughts?
 
  As I recall, my thinking was:
 
  * anything connected to shmem that crashes could leave things in a bad
  state (for example locks improperly held), whereas things not connected
  to shmem could crash without it being a problem for the rest of the
  system, and thus do not require a full restart cycle.  This stuff is
  detected with the PMChildSlot thingy; therefore all bgworkers with shmem
  access should have one of these.
 
  * I don't recall offhand what the criteria is for stuff to be in
  postmaster's BackendList.  According to the comment on top of struct
  Backend, bgworkers connected to shmem should be on that list, even if
  they did not have the BGWORKER_BACKEND_DATABASE_CONNECTION flag on
  registration.  So that would be a bug.
 
  Does this help you any?

 Well, it sounds like we at least need to think about making it
 consistently depend on shmem-access rather than database-connection.
 I'm less sure what to do with workers that have not even got shmem
 access.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company





Re: [HACKERS] Memory Accounting v11

2015-07-07 Thread Jeff Davis
On Tue, 2015-07-07 at 16:49 +1200, David Rowley wrote:

 of performance decrease anywhere. I'm just getting too much variation
 in the test results to get any sort of idea.
 
That was my experience as well. Thank you for taking a look.

 My main question here is: How sure are you that none of your intended
 use cases will need to call MemoryContextMemAllocated with recurse =
 true? I'd imagine if you ever have to
 use MemoryContextMemAllocated(ctx, true) then we'll all be sitting
 around with our stopwatches out to see if the call incurs too much of
 a performance penalty.

I don't expect HashAgg to be using many memory contexts. It did with
array_agg, but that's been changed, and was a bad pattern anyway (one
context per group is quite wasteful).

If it turns out to be slow, we can call it less frequently (e.g.,
extrapolate the growth rate to determine when to check).
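
A minimal sketch of that idea, assuming hypothetical bookkeeping
(last_checked, mem_limit, spill_to_disk); only
MemoryContextMemAllocated() is from the patch:

    /* re-check actual usage only after ~12% growth since the last check */
    if (context->mem_allocated > last_checked + last_checked / 8)
    {
        Size        total = MemoryContextMemAllocated(context, true);

        last_checked = context->mem_allocated;
        if (total > mem_limit)
            spill_to_disk();    /* hypothetical HashAgg response */
    }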

 
 Shouldn't you be setting oldblksize after the if? I'd imagine the
 realloc() failure is rare, but it just seems better to do it just
 before it's required and only when required.

I don't think that's safe. Realloc can return an entirely new pointer,
so the endptr would be pointing outside of the allocation. I'd have to
hang on to the original pointer before the realloc, so I'd need an extra
variable anyway.


 
 Here there's a double assignment of child.

Will fix.

 
  I might be mistaken here, but can't you just set context->mem_allocated
 = 0; after that loop? 
 Or maybe it would be an improvement to only do the decrement
 if MEMORY_CONTEXT_CHECKING is defined, then Assert() that
 mem_allocated == 0 after the loop...
 
OK. Why not just always Assert that?
 
 One other thing that I don't remember seeing discussed would be to
 just store a List in each context which would store all of the
 mem_allocated's that need to be updated during each allocation and
 deallocation of memory. The list would be setup once when the context
 is created. When a child context is setup we'd just loop up the parent
 context chain and list_concat() all of the parent's lists onto the
 child's lists. Would this method add too much overhead when a context
 is deleted? Has this been explored already?
 
That's a good idea, as it would be faster than recursing. Also, the
number of parents can't change, so we can just make an array, and that
would be quite fast to update. Unless I'm missing something, this sounds
like a nice solution. It would require more space in MemoryContextData,
but that might not be a problem.
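
A minimal sketch of that ancestor-array variant, assuming hypothetical
nancestors/ancestors fields on MemoryContextData (nothing here is in
the v11 patch):

    static inline void
    update_allocation(MemoryContext context, int64 size)
    {
        int         i;

        /* update this context and every ancestor, so readers never recurse */
        context->mem_allocated += size;
        for (i = 0; i < context->nancestors; i++)
            context->ancestors[i]->mem_allocated += size;
    }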

Regards,
Jeff Davis





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RFC: replace pg_stat_activity.waiting with something more descriptive

2015-07-07 Thread Fujii Masao
On Tue, Jun 30, 2015 at 10:30 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Fri, Jun 26, 2015 at 6:26 PM, Robert Haas robertmh...@gmail.com wrote:

 On Thu, Jun 25, 2015 at 11:57 PM, Amit Kapila amit.kapil...@gmail.com
 wrote:
   3. Add new view 'pg_stat_wait_event' with following info:
   pid   - process id of this backend
   waiting - true for any form of wait, false otherwise
   wait_event_type - Heavy Weight Lock, Light Weight Lock, I/O wait,
   etc
   wait_event - Lock (Relation), Lock (Relation Extension), etc
  I am pretty unconvinced that it's a good idea to try to split up the
  wait event into two columns.  I'm only proposing ~20 wait states, so
  there's something like 5 bits of information here.  Spreading that
  over two text columns is a lot, and note that Amit's text would
  basically recapitulate the contents of the first column in the second
  one, which I cannot get excited about.
  There is an advantage in splitting the columns, which is that if the
  wait_event_type column indicates Heavy Weight Lock, then the user can
  go and check further details in pg_locks. I think he can do that even
  by looking at the wait_event column, but that might not be as
  straightforward as with a wait_event_type column.

 It's just a matter of writing event_type LIKE 'Lock %' instead of
 event_type = 'Lock'.


 Yes, it can be done that way, and maybe that is not inconvenient for the
 user, but there is another type of information the user might need, such
 as the distinct resources on which a wait is possible, which again is
 easier to find with a separate event_type column. I think some more
 discussion is required before we freeze the user interface for this
 feature, but in the meantime I have prepared an initial patch that adds
 a new column wait_event to pg_stat_activity.
 For now, I have added support for heavyweight locks, lightweight
 locks [1] and the buffer cleanup lock.  I could add the other types
 (spinlock delay sleep, I/O, network I/O, etc.) if there is no objection
 to the approach used in the patch to implement this feature.

 [1] For LWLocks, I have currently used the wait_event OtherLock for
 locks other than NUM_FIXED_LWLOCKS (refer to the function NumLWLocks to
 see all types of LWLocks).  The reason is that there is no
 straightforward way to get the id (lockid) of such locks, as for some of
 them (like shared_buffers, MaxBackends) the number of locks depends on
 run-time configuration parameters.  I think if we want to handle those,
 we could either do some math to find the lockid based on the runtime
 values of these parameters, or we could add a tag to the LWLock
 structure (indicating the lock type) and fill it during lock
 initialization, or maybe find some other better way to do it.

 I have still not added documentation and have not changed anything for
 the waiting column in pg_stat_activity, as I think we need to finalize
 the user interface first.  Apart from that, as mentioned above, waits
 for some event types (like I/O, network I/O, etc.) are still not added,
 and I also think separate functions (like the existing
 pg_stat_get_backend_waiting) will be required, which again depends upon
 the user interface.

Yes, we need to discuss what events to track. As I suggested upthread,
Itagaki-san's patch would be helpful to think about that. He proposed to
track not only wait events like locking and I/O operations but also CPU
events like query parsing, planning, etc. I think that tracking even CPU
events would be useful for analyzing performance problems. For example,
if pg_stat_activity reports many times that a large majority of backends
are doing QUERY PLANNING, the DBA can treat that as a possible cause of a
performance bottleneck and check whether the application uses prepared
statements properly.

Here are some review comments on the patch:

When I played around with the patch, make check failed.

Each backend reports its event when trying to take a lock, but the
reported event is never reset until the next event is reported. Is this
OK? It means that the wait_event column keeps showing the *last* event
even while a backend is idle, for example. So, shouldn't we reset the
reported event, or report another one, when releasing the lock?
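One shape the reset could take, purely as a sketch (the types, helpers and
constant below are invented for illustration; they are not in the patch as
posted):

/* Invented, simplified stand-in for the shared-memory backend entry. */
typedef struct WaitStatus
{
	volatile uint8 st_wait_event;	/* 0 = not waiting */
} WaitStatus;

static inline void
report_wait_start(WaitStatus *ws, uint8 event)
{
	ws->st_wait_event = event;
}

static inline void
report_wait_end(WaitStatus *ws)
{
	ws->st_wait_event = 0;		/* clear, so idle backends show no event */
}

/* usage sketch around a heavyweight lock acquisition */
report_wait_start(my_wait_status, WAIT_LOCK_RELATION);	/* invented constant */
res = LockAcquire(locktag, lockmode, sessionLock, dontWait);
report_wait_end(my_wait_status);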

+read_string_from_waitevent(uint8 wait_event)

The performance of this function looks poor because its worst case is
O(n), where n is the number of events that we are trying to track.
Also, in pg_stat_activity, this function is called once per backend.
Can't we retrieve the event name by using the wait event ID as an index
into the WaitEventTypeMap array?
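The O(1) lookup being suggested could look roughly like this (array
contents are illustrative only; the real list would mirror the patch's
event enum):

/* Kept in sync with the wait event enum; index == event ID. */
static const char *const WaitEventNames[] = {
	"",							/* 0: not waiting */
	"Lock (Relation)",
	"Lock (Relation Extension)",
	"LWLock (Buffer Content)",
	/* ... one entry per tracked event ... */
};

static const char *
read_string_from_waitevent(uint8 wait_event)
{
	Assert(wait_event < lengthof(WaitEventNames));
	return WaitEventNames[wait_event];
}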

Regards,

-- 
Fujii Masao




Re: [HACKERS] Memory Accounting v11

2015-07-07 Thread David Rowley
On 7 July 2015 at 18:59, Jeff Davis pg...@j-davis.com wrote:

 On Tue, 2015-07-07 at 16:49 +1200, David Rowley wrote:
  I might be mistaken here, but can't you just set context->mem_allocated
  = 0; after that loop?
  Or maybe it would be an improvement to only do the decrement
  if MEMORY_CONTEXT_CHECKING is defined, then Assert() that
  mem_allocated == 0 after the loop...
 
 OK. Why not just always Assert that?
 


Well, I thought the per-loop decrement of mem_allocated was wasteful;
shouldn't it always get to zero after the final loop anyway?
If so, for efficiency it would be better to zero it after the loop, but
if we do that then we can't assert that it reaches zero.


  One other thing that I don't remember seeing discussed would be to
  just store a List in each context which would store all of the
  mem_allocated's that need to be updated during each allocation and
  deallocation of memory. The list would be setup once when the context
  is created. When a child context is setup we'd just loop up the parent
  context chain and list_concat() all of the parent's lists onto the
  child's lists. Would this method add too much overhead when a context
  is deleted? Has this been explored already?
 
 That's a good idea, as it would be faster than recursing. Also, the
 number of parents can't change, so we can just make an array, and that
 would be quite fast to update. Unless I'm missing something, this sounds
 like a nice solution. It would require more space in MemoryContextData,
 but that might not be a problem.


Oh, an array is even better. But why more space? Won't it just be a
pointer to an array? It can be NULL if there's nothing to track. It
sounds like it would be the same amount of space, at least on a 64-bit
machine.

One thing to keep in mind: I think if a parent is tracking memory, then
a child will also need to track memory too, as when the child context is
deleted we'll need to decrement the parent's mem_allocated by that of
all its child contexts.
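A standalone sketch of that scheme (invented names, plain malloc instead
of PostgreSQL's allocators, error handling elided): each context carries
an array of pointers to every counter it must bump, its own included, so
per-allocation accounting is a flat loop and child creation just copies
the parent's array. That also bakes in the point above: a child of a
tracking parent necessarily tracks too.

#include <stdlib.h>

typedef struct TrackedContext
{
	size_t		mem_allocated;	/* bytes charged to this context + children */
	size_t	  **counters;		/* ancestors' counters, then our own */
	int			ncounters;
} TrackedContext;

/* Build the counter array once, at context-creation time. */
static void
tracked_context_init(TrackedContext *ctx, TrackedContext *parent)
{
	int			n = (parent ? parent->ncounters : 0) + 1;

	ctx->mem_allocated = 0;
	ctx->counters = malloc(n * sizeof(size_t *));	/* NULL check elided */
	for (int i = 0; i < n - 1; i++)
		ctx->counters[i] = parent->counters[i];		/* inherit ancestors */
	ctx->counters[n - 1] = &ctx->mem_allocated;
	ctx->ncounters = n;
}

/* Per-allocation cost: one tight loop, no recursion at query time. */
static void
tracked_context_account(TrackedContext *ctx, size_t size)
{
	for (int i = 0; i < ctx->ncounters; i++)
		*ctx->counters[i] += size;
}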

Regards

David Rowley

--
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Parallel Seq Scan

2015-07-07 Thread Jeff Davis
On Tue, 2015-07-07 at 09:27 +0530, Amit Kapila wrote:

 I am not sure how many blocks difference could be considered okay for
 deviation?

In my testing (a long time ago) deviations of tens of blocks didn't show
a problem.

However, an assumption of the sync scan work was that the CPU is
processing faster than the IO system; whereas the parallel scan patch
assumes that the IO system is faster than a single core. So perhaps the
features are incompatible after all. Only testing will say for sure.

Then again, syncscans are designed in such a way that they are unlikely
to hurt in any situation. Even if the scans diverge (or never converge
in the first place), it shouldn't be worse than starting at block zero
every time.

I'd prefer to leave syncscans intact for parallel scans unless you find
a reasonable situation where they perform worse. This shouldn't add any
complexity to the patch (if it does, let me know).

Regards,
Jeff Davis







Re: [HACKERS] New functions

2015-07-07 Thread Michael Paquier
On Mon, Mar 23, 2015 at 12:46 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Mon, Mar 23, 2015 at 1:48 AM, Воронин Дмитрий
 carriingfat...@yandex.ru wrote:

  Please, attach new version of my patch to commitfest page.

 Michael, I found a way to attach the patch. Sorry for the trouble.

 Cool. Thanks. I am seeing your patch entry here:
 https://commitfest.postgresql.org/5/192/
 I'll try to take a look at it for the next commit fest, but please do
 not expect immediate feedback; things are getting wrapped up for 9.5.

OK, I am back on this patch and I am just looking at it. There are a
couple of issues that need to be fixed, and I will send back feedback
and a new patch by tomorrow.
-- 
Michael




Re: [HACKERS] raw output from copy

2015-07-07 Thread Pavel Stehule
Hi

The previous patch was broken and buggy.

Here is a new version with fixed upload and more tests.

The interesting part is that I did not need to modify the interface or
the client, so it should work with any current driver with protocol
support >= 3.

Regards

Pavel



2015-07-06 23:34 GMT+02:00 Pavel Stehule pavel.steh...@gmail.com:

 Hi

 here is a version with both direction support.

 postgres=# copy foo from '/tmp/1.jpg' (format raw);
 COPY 1
 Time: 93.021 ms
 postgres=# \dt+ foo
List of relations
 ┌────────┬──────┬───────┬───────┬────────┬─────────────┐
 │ Schema │ Name │ Type  │ Owner │  Size  │ Description │
 ╞════════╪══════╪═══════╪═══════╪════════╪═════════════╡
 │ public │ foo  │ table │ pavel │ 256 kB │             │
 └────────┴──────┴───────┴───────┴────────┴─────────────┘
 (1 row)

 postgres=# \copy foo to '~/3.jpg' (format raw)
 COPY 1
 Time: 2.401 ms

 Regards

 Pavel

 2015-07-02 17:02 GMT+02:00 Tom Lane t...@sss.pgh.pa.us:

 Andrew Dunstan and...@dunslane.net writes:
  Does the COPY line protocol even support binary data?

 The protocol, per se, just transmits a byte stream.  There is a field
 in the CopyInResponse/CopyOutResponse messages that indicates whether
 a text or binary copy is being done.  One thing we'd have to consider
 is whether raw mode is sufficiently different from binary to justify
 an additional value for this field, and if so whether that constitutes
 a protocol break.

 IIRC, psql wouldn't really care; it just transfers the byte stream to or
 from the target file, regardless of text or binary mode.  But there might
 be other client libraries that are smarter and expect binary mode to
 mean the binary file format specified in the COPY reference page.  So
 there may be value in being explicit about raw mode in these messages.

 A key point in all this is that people who need raw transfer probably
 need it in both directions, a point that your SELECT proposal cannot
 satisfy, but hacking COPY could.  So I lean towards the latter really.

 regards, tom lane



diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
new file mode 100644
index 2850b47..5739158
*** a/doc/src/sgml/ref/copy.sgml
--- b/doc/src/sgml/ref/copy.sgml
*** COPY { <replaceable class="parameter">ta
*** 190,196 ****
      Selects the data format to be read or written:
      <literal>text</literal>,
      <literal>csv</literal> (Comma Separated Values),
!     or <literal>binary</literal>.
      The default is <literal>text</literal>.
     </para>
    </listitem>
--- 190,196 ----
      Selects the data format to be read or written:
      <literal>text</literal>,
      <literal>csv</literal> (Comma Separated Values),
!     <literal>binary</literal> or <literal>raw</literal>.
      The default is <literal>text</literal>.
     </para>
    </listitem>
*** OIDs to be shown as null if that ever pr
*** 881,886 ****
--- 881,918 ----
    </para>
   </refsect3>
  </refsect2>
+ 
+   <refsect2>
+      <title>Raw Format</title>
+ 
+    <para>
+     The <literal>raw</literal> format option causes all data to be
+     stored/read as binary format rather than as text. It shares the
+     data format with the <literal>binary</literal> format. This format
+     doesn't use any metadata - only row data in network byte order is
+     exported or imported.
+    </para>
+ 
+    <para>
+     Because this format doesn't support any delimiter, only one value
+     can be exported or imported. NULL values are not allowed.
+    </para>
+    <para>
+     The <literal>raw</literal> format can be used to export or import
+     bytea values.
+ <programlisting>
+ COPY images(data) FROM '/usr1/proj/img/01.jpg' (FORMAT raw);
+ </programlisting>
+     It can also be used to export XML in a different encoding, or to
+     import a valid XML document with any supported encoding:
+ <screen><![CDATA[
+ SET client_encoding TO latin2;
+ 
+ COPY (SELECT xmlelement(NAME data, 'Hello')) TO stdout (FORMAT raw);
+ <?xml version="1.0" encoding="LATIN2"?><data>Hello</data>
+ ]]></screen>
+    </para>
+   </refsect2>
   </refsect1>
 
  <refsect1>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
new file mode 100644
index 8904676..c69854c
*** a/src/backend/commands/copy.c
--- b/src/backend/commands/copy.c
*** typedef enum EolType
*** 92,97 ****
--- 92,102 ----
   * it's faster to make useless comparisons to trailing bytes than it is to
   * invoke pg_encoding_mblen() to skip over them. encoding_embeds_ascii is TRUE
   * when we have to do it the hard way.
+  *
+  * COPY supports three modes: text, binary and raw. The text format is plain
+  * text multiline format with specified delimiter. The binary format holds
+  * metadata (numbers, sizes) and data. The raw format holds data only and
+  * only one non NULL value can be processed.
   */
  typedef struct CopyStateData
  {
*** typedef struct CopyStateData
*** 113,118 ****
--- 118,124 ----
  	char	   *filename;		/* filename, or NULL for STDIN/STDOUT */
  	bool		is_program;		/* is 'filename' a program to popen? */
  	bool		

[HACKERS] Re: [HACKERS] GSoC 2015 proposal: Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED” statement

2015-07-07 Thread Fabrízio de Royes Mello
On Fri, Jul 3, 2015 at 9:29 AM, Ashutosh Bapat 
ashutosh.ba...@enterprisedb.com wrote:


 On Fri, Jul 3, 2015 at 4:48 PM, Fabrízio de Royes Mello 
fabriziome...@gmail.com wrote:


 On Thu, Apr 2, 2015 at 3:24 PM, Robert Haas robertmh...@gmail.com
wrote:
 
  On Wed, Mar 25, 2015 at 9:46 PM, Fabrízio de Royes Mello
  fabriziome...@gmail.com wrote:
  
http://www.postgresql.org/message-id/ca+tgmozm+-0r7h0edpzzjbokvvq+gavkchmno4fypveccw-...@mail.gmail.com
 
  I like the idea of the feature a lot, but the proposal to which you
  refer here mentions some problems, which I'm curious how you think you
  might solve.  (I don't have any good ideas myself, beyond what I
  mentioned there.)
 

  You're right, and we have another design (steps 1 and 2 were implemented
last year):


 *** ALTER TABLE changes

 1) ATController
 1.1) Acquire an AccessExclusiveLock
(src/backend/commands/tablecmds.c - AlterTableGetLockLevel:3023)


 2) Prepare to change relpersistence (src/backend/commands/tablecmds.c -
ATPrepCmd:3249-3270)
 • check temp table (src/backend/commands/tablecmds.c -
ATPrepChangePersistence:11074)
 • check foreign key constraints (src/backend/commands/tablecmds.c -
ATPrepChangePersistence:11102)


 3) FlushRelationBuffers, DropRelFileNodeBuffers and smgrimmedsync
(MAIN_FORKNUM, FSM_FORKNUM, VISIBILITYMAP_FORKNUM and INIT_FORKNUM if
exists)


 4) Create a new fork called  TRANSIENT INIT FORK:

 • from Unlogged to Logged  (create _initl fork) (INIT_TO_LOGGED_FORKNUM)
   ∘ new forkName (src/common/relpath.c) called _initl
   ∘ insert XLog record to drop it if transaction abort

 • from Logged to Unlogged (create _initu fork) (INIT_TO_UNLOGGED_FORKUM)
   ∘ new forkName (src/common/relpath.c) called _initu
   ∘ insert XLog record to drop it if transaction abort


 AFAIU, while reading WAL, the server doesn't know whether the transaction
that produced that WAL record aborted or committed. It's only when it sees
a COMMIT/ABORT record down the line that it can confirm whether the
transaction committed or aborted. So, one can only redo the things that
the WAL says have been done; we cannot undo things based on the WAL. We
might record this fact somewhere while reading the WAL and act on it once
we know the status of the transaction, but I do not see that as part of
this idea. This comment applies to all the steps inserting WAL records
for undoing things.



Even if I xlog the create/drop of the transient fork?



 5) Change the relpersistence in catalog (pg_class-relpersistence)
(heap, toast, indexes)


 6) Remove/Create INIT_FORK

 • from Unlogged to Logged
   ∘ remove the INIT_FORK and INIT_TO_LOGGED_FORK adding to the
pendingDeletes queue
 • from Logged to Unlogged
   ∘ remove the INIT_TO_UNLOGGED_FORK adding to the pendingDeletes queue
   ∘ create the INIT_FORK using heap_create_init_fork
   ∘ insert XLog record to drop init fork if the transaction abort



 *** CRASH RECOVERY changes

 1) During crash recovery
(src/backend/access/transam/xlog.c:6507:ResetUnloggedRelations)


 This operation is carried out in two phases: one before replaying WAL
records and a second after that. Are you sure that the WAL generated for
the unlogged or logged forks, as described above, has been taken care of?


Hummm... actually no...




 • if the transient fork _initl exists then
   ∘ drop the transient fork _initl
   ∘ if the init fork doesn't exist then create it
   ∘ reset relation
 • if the transient fork _initu exists then
   ∘ drop the transient fork _initu
   ∘ if the init fork exists then drop it
   ∘ don't reset the relation


 Consider the case of converting unlogged to logged. The server crashes
after the 6th step, when both of the forks have been removed. During crash
recovery, it will not see either fork and won't reset the unlogged table.
Remember, the table is still unlogged, since the transaction was aborted
due to the crash. So, it has to have an init fork to be reset during crash
recovery.

 Similarly, while converting from logged to unlogged: if the server
crashes in the 6th step after creating the INIT_FORK, during crash
recovery the init fork will be seen and a supposedly logged table will be
trashed.



My intention for the 6th step is that everything is recorded to WAL, so if
a crash occurs the recovery process cleans up the mess.


 The ideas in 1 and 2 might be better than having yet another init fork.


The link for idea 2 is missing...



 1. http://www.postgresql.org/message-id/533d457a.4030...@vmware.com


Unfortunately I completely missed this idea, but it is also interesting.
However, I'll need to stamp it in both directions (from unlogged to logged
and vice versa).

During recovery, can I use
src/backend/access/transam/clog.c:TransactionIdGetStatus to check the
status of a transaction? I'm asking because I don't know this part of the
code very well.
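For reference, a sketch of the clog.c interface in question (whether it is
safe to call at that exact point in recovery is the open question; the
higher-level wrappers in transam.c, such as TransactionIdDidCommit(), are
the more usual route):

#include "access/clog.h"

XLogRecPtr	lsn;		/* commit-record LSN bookkeeping; ignored here */
XidStatus	status = TransactionIdGetStatus(xid, &lsn);

if (status == TRANSACTION_STATUS_ABORTED)
{
	/* the transaction is known aborted; cleanup could be triggered here */
}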

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog: http://fabriziomello.github.io
 Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: 

Re: [HACKERS] More logging for autovacuum

2015-07-07 Thread Sawada Masahiko
On Sat, Jul 4, 2015 at 2:30 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Thu, Jul 2, 2015 at 4:41 AM, Gurjeet Singh gurj...@singh.im wrote:

 On Fri, Aug 17, 2007 at 3:14 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:

 Gregory Stark wrote:
 
  I'm having trouble following what's going on with autovacuum, and I'm
  finding the existing logging insufficient. In particular, the fact that
  it only logs vacuum runs *after* the vacuum finishes makes it hard to
  see what vacuums are running at any given time. Also, I want to see
  what is making autovacuum decide to forgo vacuuming a table, and the
  log with that information is at DEBUG2.

 So did this idea go anywhere?


 Assuming the thread stopped here, I'd like to rekindle the proposal.

 log_min_messages acts as a single gate for everything headed for the
 server logs; controls for per-background process logging do not exist. If
 one wants to see DEBUG/INFO messages for just the Autovacuum (or
 checkpointer, bgwriter, etc.), they have to set log_min_messages to that
 level, but the result would be a lot of clutter from other processes to
 grovel through, to see the messages of interest.


 I think that will be quite helpful.  During the development of the
 parallel sequential scan patch, it was quite helpful to see the LOG
 messages of bgworkers; however, one of the recent commits (91118f1a)
 has changed those to DEBUG1, and now if I have to enable all DEBUG1
 messages, what I need will be difficult to find among all the log
 messages.
 Having control of separate logging for background tasks will serve such
 a purpose.

 The facilities don't yet exist, but it'd be nice if such parameters when
 unset (ie NULL) pick up the value of log_min_messages. So by default, the
 users will get the same behaviour as today, but can choose to tweak per
 background-process logging when needed.


 Will this proposal allow the user to see all the messages from all the
 background workers, or will it be per background task?  For example, in
 some cases the user might want to see the messages from all bgworkers,
 and in other cases one might want to see just the messages from
 autovacuum and its workers.

 I think designing the user interface here is an important piece of work
 if others also agree that such an option is useful; could you please
 elaborate your proposal?

+1

I sometimes want to set log_min_messages per process, especially when a
log level below DEBUG is needed.
Today it's not easy to set the log level for a particular process
immediately after it launches.

Regards,

--
Sawada Masahiko






Re: [HACKERS] multivariate statistics / patch v7

2015-07-07 Thread Tomas Vondra

Hi,

On 07/07/2015 08:05 AM, Kyotaro HORIGUCHI wrote:

Hi, Tomas. I'll kick the gas pedal.


Thank you, it looks clearer. I have some comment for the brief look
at this. This patchset is relatively large so I will comment on
per-notice basis.. which means I'll send comment before examining
the entire of this patchset. Sorry in advance for the desultory
comments.


Sure. If you run into something that's not clear enough, I'm happy to
explain that (I tried to cover all the important details in the
comments, but it's a large patch, indeed.)




- Single-variate stats have a mechanism to inject arbitrary
values as statistics, that is, get_relation_stats_hook and the
similar stuffs. I want the similar mechanism for multivariate
statistics, too.


Fair point, although I'm not sure where should we place the hook,
how exactly should it be defined and how useful that would be in
the end. Can you give an example of how you'd use such hook?


...



We should carefully design the API to be able to point to the pertinent
stats in every situation. Multivariate stats are based on the correlation
of multiple columns, so I think a relid and an attribute list are enough
as the parameters.

| if (st.relid == param.relid && bms_equal(st.attnums, param.attnums))
|    /* These are the stats we want */

If we can filter the appropriate stats from all the stats using the
clause list, we can definitely build the appropriate parameter (column
set) prior to retrieving the mv statistics. Isn't that correct?


Let me briefly explain how the current clauselist_selectivity 
implementation works.


  (1) check if there are multivariate statistics on the table - if not,
  skip the multivariate parts altogether (the point of this is to
  minimize impact on users who don't use the new feature)

  (2) see if there are clauses compatible with multivariate stats - this
  only checks general compatibility without actually checking the
  existing stats (the point is to terminate early, if the clauses
  are not compatible somehow - e.g. if the clauses reference only a
  single attribute, use unsupported operators etc.)

  (3) if there are multivariate stats and compatible clauses, the
  function choose_mv_stats tries to find the best combination of
  multivariate stats with respect to the clauses (details later)

  (4) the clauses are estimated using the stats, the remaining clauses
  are estimated using the current statistics (single attribute)

The only way to reliably inject new stats is by calling a hook before 
(1), allowing it to arbitrarily modify the list of stats. Based on the 
use cases you provided, I don't think it makes much sense to add 
additional hooks in the other phases.


At this point it's not yet known, however, which clauses are compatible
with multivariate stats, or which attributes they reference. It might be
possible to simply call pull_varattnos() and pass the result to the hook,
except that does not work with RestrictInfo :-/


Or maybe we could / should not put the hook into clauselist_selectivity 
but somewhere else? Say, to get_relation_info where we actually read the 
list of stats for the relation?
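If it went into get_relation_info(), the hook could mirror
get_relation_stats_hook; a rough sketch (entirely hypothetical, including
the mvstatlist field name; nothing like this exists in the patch):

/* Hypothetical hook type: let a plugin filter or replace the list of
 * multivariate statistics loaded for a relation. */
typedef List *(*get_mv_stats_hook_type) (PlannerInfo *root,
										 Oid relationObjectId,
										 List *mvstats);

get_mv_stats_hook_type get_mv_stats_hook = NULL;

/* In get_relation_info(), after reading the stats catalog entries: */
if (get_mv_stats_hook)
	rel->mvstatlist = (*get_mv_stats_hook) (root, relationObjectId,
											rel->mvstatlist);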






0001:

- I also don't think it is the right thing for expression_tree_walker
to recognize RestrictInfo, since it is not part of an expression.


Yes. In my working git repo, I've reworked this to use the second
option, i.e. adding RestrictInfo pull_(varno|varattno)_walker:

https://github.com/tvondra/postgres/commit/2dc79b914c759d31becd8ae670b37b79663a595f

Do you think this is the correct solution? If not, how to fix it?


The reason why I think it is not appropriate is that RestrictInfo
is not part of an expression.

An increase of selectivity for a condition due to column correlation
occurs only for a set of conjunctive clauses; an OR operation divides
the sets. Is that agreeable? RestrictInfos can be nested within each
other, and we should be aware of the AND/OR operators, which is
something expression_tree_walker does not handle.


I still don't understand why you think we need to differentiate between 
AND and OR operators. There's nothing wrong with estimating OR clauses 
using multivariate statistics.




Perhaps we should provide a dedicated function, something like
find_conjunctive_attr_set, which does this:


Perhaps. The reason why I added support for RestrictInfo into the 
existing walker implementations is that it seemed like the easiest way 
to fix the issue. But if there are reasons why that's incorrect, then 
inventing a new function is probably the right way.




- Check the type of the top expression of the clause.

   - If it is a RestrictInfo, check clause_relids, then check the
     clause.

   - If it is a boolean OR, stop the search and return an empty set of
     attributes.

   - If it is a boolean AND, check the components further. A list of
     RestrictInfos should be treated as an AND connection.

   - If it is an operator expression, collect the used relids and attrs
     by walking the expression tree.

I may be missing something ...

Re: [HACKERS] Missing latex-longtable value

2015-07-07 Thread Bruce Momjian
On Tue, Jul  7, 2015 at 01:05:09PM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  On Tue, Jul  7, 2015 at 11:48:13AM -0400, Tom Lane wrote:
  It's a bug.  Back-patch as needed.
 
  Doesn't that cause translation string differences that are worse than
  the original bug, e.g.:
   psql_error(\\pset: allowed formats are unaligned, aligned, wrapped, 
  html, asciidoc, latex, latex-longtable, troff-ms\n);
 
 No.  When we've discussed this sort of thing in the past, people have been
 quite clear that they'd rather have accurate messages that come out in
 English than inaccurate-though-localized messages.  Certainly we should
 avoid gratuitous changes in back-branch localized strings, but this is a
 factual error in the message.

OK, good to know.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                           http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Repeated pg_upgrade buildfarm failures on binturon

2015-07-07 Thread Oskari Saarenmaa
07.07.2015 19:50, Tom Lane wrote:
 Oskari Saarenmaa o...@ohmu.fi writes:
 I've restricted builds to one at a time on that host to work around this
 issue for now.  Also attached a patch to explicitly set PWD=$(CURDIR) in
 the Makefile to make sure test.sh runs with the right directory.
 
 I've pushed a patch for this issue.  Please revert your buildfarm
 configuration so that we can verify it works now.

Ok, I just reverted the configuration change and started two test runs;
they're now using the correct directories.

Thanks!
Oskari




Re: [HACKERS] Information of pg_stat_ssl visible to all users

2015-07-07 Thread Josh Berkus
On 07/07/2015 11:29 AM, Stephen Frost wrote:
 * Josh Berkus (j...@agliodbs.com) wrote:
 On 07/07/2015 09:06 AM, Magnus Hagander wrote:

 To make it accessible to monitoring systems that don't run as superuser
 (which should be most monitoring systems, but we have other cases making
 that hard as has already been mentioned upthread). 

 I'm having a hard time trying to figure out a consensus in this thread.
 I think there are slightly more arguments for limiting the access though.

 The question then is, if we want to hide everything, do we care about
 doing the NULL dance, or should we just throw an error for
 non-superusers trying to access it?

 I'm going to vote against blocking the entire view for non-superusers.
 One of the things people will want to monitor is "are all connections
 from subnet X using SSL?", which is most easily answered by joining
 pg_stat_activity and pg_stat_ssl.

 If we force users to use superuser privs to find this out, then we're
 encouraging them to run monitoring as superuser, which is something we
 want to get *away* from.
 
 I agree with all of this, but I'm worried that if we make it available
 now then we may not be able to hide it later, even once we have the
 monitoring role defined, because of backwards compatibility concerns.
 
 If we aren't worried about that, then perhaps we can leave it less
 strict for 9.5 and then make it stricter for 9.6.
 
 I'd be OK with concealing some columns:

 postgres=# select * from pg_stat_ssl;
  pid | ssl | version |           cipher            | bits | compression | clientdn
 -----+-----+---------+-----------------------------+------+-------------+----------
   37 | t   | TLSv1.2 | ECDHE-RSA-AES256-GCM-SHA384 |  256 | f           |

 I can see NULLifying cipher and DN columns.  The other columns, it's
 hard to imagine what use an attacker could put them to that they
 wouldn't be able to find out the same information easily using other routes.
 
 Perhaps not, but I'm not sure how useful those columns would be to a
 monitoring system either..  I'd rather keep it simple.

So what about making just pid, ssl and compression available?  That's
mostly what anyone would want to monitor, except for periodic security
audits.  Audits could be done by superuser, no problem.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Information of pg_stat_ssl visible to all users

2015-07-07 Thread Josh Berkus
On 07/07/2015 09:06 AM, Magnus Hagander wrote:
 
 To make it accessible to monitoring systems that don't run as superuser
 (which should be most monitoring systems, but we have other cases making
 that hard as has already been mentioned upthread). 
 
 I'm having a hard time trying to figure out a consensus in this thread.
 I think there are slightly more arguments for limiting the access though.
 
 The question then is, if we want to hide everything, do we care about
 doing the NULL dance, or should we just throw an error for
 non-superusers trying to access it?

I'm going to vote against blocking the entire view for non-superusers.
One of the things people will want to monitor is "are all connections
from subnet X using SSL?", which is most easily answered by joining
pg_stat_activity and pg_stat_ssl.

If we force users to use superuser privs to find this out, then we're
encouraging them to run monitoring as superuser, which is something we
want to get *away* from.

I'd be OK with concealing some columns:

postgres=# select * from pg_stat_ssl;
 pid | ssl | version |           cipher            | bits | compression | clientdn
-----+-----+---------+-----------------------------+------+-------------+----------
  37 | t   | TLSv1.2 | ECDHE-RSA-AES256-GCM-SHA384 |  256 | f           |

I can see NULLifying cipher and DN columns.  The other columns, it's
hard to imagine what use an attacker could put them to that they
wouldn't be able to find out the same information easily using other routes.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Information of pg_stat_ssl visible to all users

2015-07-07 Thread Stephen Frost
* Josh Berkus (j...@agliodbs.com) wrote:
 On 07/07/2015 09:06 AM, Magnus Hagander wrote:
  
  To make it accessible to monitoring systems that don't run as superuser
  (which should be most monitoring systems, but we have other cases making
  that hard as has already been mentioned upthread). 
  
  I'm having a hard time trying to figure out a consensus in this thread.
  I think there are slightly more arguments for limiting the access though.
  
  The question then is, if we want to hide everything, do we care about
  doing the NULL dance, or should we just throw an error for
  non-superusers trying to access it?
 
 I'm going to vote against blocking the entire view for non-superusers.
 One of the things people will want to monitor is "are all connections
 from subnet X using SSL?", which is most easily answered by joining
 pg_stat_activity and pg_stat_ssl.
 
 If we force users to use superuser privs to find this out, then we're
 encouraging them to run monitoring as superuser, which is something we
 want to get *away* from.

I agree with all of this, but I'm worried that if we make it available
now then we may not be able to hide it later, even once we have the
monitoring role defined, because of backwards compatibility concerns.

If we aren't worried about that, then perhaps we can leave it less
strict for 9.5 and then make it stricter for 9.6.

 I'd be OK with concealing some columns:
 
 postgres=# select * from pg_stat_ssl;
  pid | ssl | version |           cipher            | bits | compression | clientdn
 -----+-----+---------+-----------------------------+------+-------------+----------
   37 | t   | TLSv1.2 | ECDHE-RSA-AES256-GCM-SHA384 |  256 | f           |
 
 I can see NULLifying cipher and DN columns.  The other columns, it's
 hard to imagine what use an attacker could put them to that they
 wouldn't be able to find out the same information easily using other routes.

Perhaps not, but I'm not sure how useful those columns would be to a
monitoring system either..  I'd rather keep it simple.

Thanks!

Stephen





Re: [HACKERS] FPW compression leaks information

2015-07-07 Thread Claudio Freire
On Tue, Jul 7, 2015 at 2:24 PM, Stephen Frost sfr...@snowman.net wrote:
 * Claudio Freire (klaussfre...@gmail.com) wrote:
 On Tue, Jul 7, 2015 at 12:34 PM, Stephen Frost sfr...@snowman.net wrote:
  * Heikki Linnakangas (hlinn...@iki.fi) wrote:
  On 07/07/2015 04:31 PM, Stephen Frost wrote:
  The alternative is to have monitoring tools which are running as
  superuser, which, in my view at least, is far worse.
 
  Or don't enable fpw_compression for tables where the information
  leak is a problem.
 
  My hope would be that we would enable FPW compression by default for
  everyone as a nice optimization.  Relegating it to a risky option which
  has to be tweaked on a per-table basis, but only for those tables where
  you don't mind the risk, makes a nice optimization nearly unusable for
  many environments.

 No, only tables that have RLS (or the equivalent, like in the case of
 pg_authid), where the leak may be meaningful.

 The attack requires control over an adjacent (same page) row, but not
 over the row being attacked. That's RLS.

 Eh?  I don't recall Heikki's attack requiring RLS and what about
 column-level privilege cases where you have access to the row but not to
 one of the columns?

That's right, column-level too.

Heikki's change-password step requires something very similar to RLS,
where roles can only update their own row. pg_authid also has
column-level restrictions, where you only see your own hashed password,
and that too may make the attack useful. In the absence of row- or
column-level privileges, however, the attack is unnecessary, and FPW
compression can be applied liberally.




Re: [HACKERS] Improving log capture of TAP tests with IPC::Run

2015-07-07 Thread Heikki Linnakangas

On 06/25/2015 07:14 AM, Michael Paquier wrote:

After looking at the issues with the TAP test suite that hamster faced
a couple of days ago, which is what has been discussed on this thread:
http://www.postgresql.org/message-id/13002.1434307...@sss.pgh.pa.us

I have developed a patch to improve log capture of the TAP tests by
being able to collect stderr and stdout output of each command run in
the tests by using more extensively IPC::Run::run (instead of system()
that is not able to help much) that has already been sent on the
thread above.


This is a massive improvement to the usability of TAP tests. They are 
practically impossible to debug currently. Thanks for working on this!


The patch redirects the output of all system_or_bail commands to a log 
file. That's a good start, but I wish we could get *everything* in the 
same log file. That includes also:


* stdout and stderr of *all* commands. Including all the commands run 
with command_ok/command_fails.


* the command line of commands being executed. It's difficult to follow 
the log when you don't know which output corresponds to which command.


* whenever a test case is reported as success/fail.

Looking at the manual page of Test::More, it looks like you could change
where the perl script's STDOUT and STDERR point to, because Test::More
takes a copy of them (when? at program startup, I guess). That would be
much more convenient than decorating every run call with ">> logfile".


- Heikki





Re: [HACKERS] Improving regression tests to check LOCK TABLE and table permissions

2015-07-07 Thread Joe Conway

On 05/11/2015 07:52 PM, Michael Paquier wrote:
 Hi all,
 
 As mentioned in this thread, it would be good to have regression
 tests to test the interactions with permissions and LOCK TABLE: 
http://www.postgresql.org/message-id/20150511195335.ge30...@tamriel.snowman.net

 
Attached is a patch achieving that.
 Note that it does some checks on the modes SHARE ACCESS, ROW
 EXCLUSIVE and ACCESS EXCLUSIVE to check all the code paths of 
 LockTableAclCheck@lockcmds.c.
 
 I'll add an entry in the next CF to keep track of it. Regards,

Committed and pushed to master and 9.5

Joe

-- 
Joe Conway




Re: [HACKERS] PL/pgSQL, RAISE and error context

2015-07-07 Thread Pavel Stehule
2015-07-07 18:15 GMT+02:00 Merlin Moncure mmonc...@gmail.com:

 On Tue, Jul 7, 2015 at 9:04 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  It doesn't have to if the behavior is guarded with a GUC.  I just
  don't understand what all the fuss is about.  The default behavior of
  logging that is well established by other languages (for example java)
  that manage error stack for you should be to:
 
  *) Give stack trace when an uncaught exception is thrown
  *) Do not give stack trace in all other logging cases unless asked for
 
  what is RAISE EXCEPTION - first or second case?

 First: RAISE (unless caught) is no different than any other kind of error.

 On Tue, Jul 7, 2015 at 9:42 AM, Heikki Linnakangas hlinn...@iki.fi
 wrote:
  On 07/07/2015 04:56 PM, Merlin Moncure wrote:
  It doesn't have to if the behavior is guarded with a GUC.  I just
  don't understand what all the fuss is about.  The default behavior of
  logging that is well established by other languages (for example java)
  that manage error stack for you should be to:
 
  *) Give stack trace when an uncaught exception is thrown
  *) Do not give stack trace in all other logging cases unless asked for
 
  Java's exception handling is so different from PostgreSQL's errors that I
  don't think there's much to be learned from that. But I'll bite:
 
  First of all, Java's exceptions always contain a stack trace. It's up to
 you
  when you catch an exception to decide whether to print it or not. try {
 ...
  } catch (Exception e) { e.printStackTrace() } is fairly common,
 actually.
  There is nothing like a NOTICE in Java, i.e. an exception that's thrown
 but
  doesn't affect the control flow. The best I can think of is
  System.out.println(), which of course has no stack trace attached to it.

 exactly.

  Perhaps you're arguing that NOTICE is more like printing to stderr, and
  should never contain any context information. I don't think that would
 be an
  improvement. It's very handy to have the context information available if
  don't know where a NOTICE is coming from, even if in most cases you're
 not
  interested in it.

 That's exactly what I'm arguing.  NOTICE (and WARNING) are for
 printing out information to client side logging; it's really the only
 tool we have for that purpose and it fits that role perfectly.  Of
 course, you may want to have NOTICE print context, especially when
 debugging, but some control over that would be nice and in most cases
 it's really not necessary.  I really don't understand the objection to
 offering control over that behavior although I certainly understand
 wanting to keep the default behavior as it currently is.

  This is really quite different from a programming language's exception
  handling. First, there's a server, which produces the errors, and a
 separate
  client, which displays them. You cannot catch an exception in the client.
 
  BTW, let me throw in one use case to consider. We've been talking about
  psql, and what to print, but imagine a more sophisticated client like
  pgAdmin. It's not limited to either printing the context or not. It could
  e.g. hide the context information of all messages when they occur, but if
  you double-click on it, it's expanded to show all the context, location
 and
  all. You can't do that if the server doesn't send the context
 information in
  the first place.
 
  I would be happy to show you the psql redirected output logs from my
  nightly server processes that spew into the megabytes because of
  logging various high level steps (did this, did that).
 
  Oh, I believe you. I understand what the problem is, we're only talking
  about how best to address it.

 Yeah.  For posterity, a psql based solution would work fine for me,
 but a server side solution has a lot of advantages (less protocol
 chatter, more configurability, keeping libpq/psql light).


I prefer a server-side solution too. With it I can have (as a plpgsql
developer) greater control over the expected output.

The client can change this behavior globally (min_context) or at the
language level (plpgsql.min_context). If somebody is worried about
security, we can enforce a rule so that min_context = error always.
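As a sketch of how plpgsql could expose such a knob (an assumed design,
including the options table, since guc.c's message-level table is not
exported; this is not code from any posted patch):

/* In plpgsql's _PG_init(): register a custom GUC holding the minimum
 * elevel at which CONTEXT lines are attached to messages. */
static const struct config_enum_entry context_level_options[] = {
	{"notice", NOTICE, false},
	{"warning", WARNING, false},
	{"error", ERROR, false},
	{NULL, 0, false}
};

static int	plpgsql_min_context = ERROR;

void
_PG_init(void)
{
	DefineCustomEnumVariable("plpgsql.min_context",
							 "Minimum message level that carries CONTEXT.",
							 NULL,
							 &plpgsql_min_context,
							 ERROR,
							 context_level_options,
							 PGC_USERSET,
							 0,
							 NULL, NULL, NULL);
}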

The possibility to enable or disable context for any individual RAISE
statement is nice to have, but it is not fundamental.

Another variant is an implementation of min_context on the client side,
but then we cannot preserve the current behavior and fix the plpgsql
RAISE EXCEPTION issue at the same time.

Pavel



 merlin



Re: [HACKERS] FPW compression leaks information

2015-07-07 Thread Heikki Linnakangas

On 07/07/2015 04:31 PM, Stephen Frost wrote:

* Heikki Linnakangas (hlinn...@iki.fi) wrote:

On 07/07/2015 04:15 PM, Stephen Frost wrote:

* Fujii Masao (masao.fu...@gmail.com) wrote:

On Thu, Apr 16, 2015 at 4:26 PM, Michael Paquier
michael.paqu...@gmail.com wrote:

+ the compression ratio of a full page image gives a hint of what is
+ the existing data of this page.  Tables that contain sensitive
+ information like structnamepg_authid/structname with password
+ data could be potential targets to such attacks. Note that as a
+ prerequisite a user needs to be able to insert data on the same page
+ as the data targeted and need to be able to detect checkpoint
+ presence to find out if a compressed full page write is included in
+ WAL to calculate the compression ratio of a page using WAL positions
+ before and after inserting data on the page with data targeted.
+/para
+   /warning


I think that, in addition to the description of the risk, it's better to
say that this vulnerability is cumbersome to exploit in practice.

Attached is the updated version of the patch. Comments?


Personally, I don't like simply documenting this issue.  I'd much rather
we restrict the WAL info as it's really got no business being available
to the general public.  I'd be fine with restricting that information to
superusers when compression is enabled, or always, for 9.5 and then
fixing it properly in 9.6 by allowing it to be visible to a
pg_monitoring default role which admins can then grant to users who
should be able to see it.


Well, then you could still launch the same attack if you have the
pg_monitoring privileges. So it would be more like a "monitoring and
grab everyone's passwords" privilege ;-). Ok, in seriousness the
attack isn't that easy to perform, but I still wouldn't feel
comfortable giving that privilege to anyone who isn't a superuser
anyway.


The alternative is to have monitoring tools which are running as
superuser, which, in my view at least, is far worse.


Or don't enable fpw_compression for tables where the information leak is 
a problem.



Yes, we'll get flak from people who are running with non-superuser
monitoring tools today, but we already have a bunch of superuser-only
things that monioring tools want, so this doesn't move the needle
particularly far, in my view.


That would be a major drawback IMHO, and a step in the wrong direction.


I'm not following.  If we don't want the information to be available to
everyone then we need to limit it, and right now the only way to do that
is to restrict it to superuser because we haven't got anything more
granular.

In other words, it seems like your above response about not wanting this
to be available to anyone except superusers is counter to this last
sentence where you seem to be saying we should continue to have the
information available to everyone and simply document that there's a
risk there as in the proposed patch.


I don't think we can or should try to hide the current WAL location. At 
least not until we have a monitoring role separate from superuserness.


- Heikki





Re: [HACKERS] Repeated pg_upgrade buildfarm failures on binturon

2015-07-07 Thread Tom Lane
Oskari Saarenmaa o...@ohmu.fi writes:
07.07.2015 14:21, Andres Freund wrote:
 Those seem to indicate something going seriously wrong to me.

 Binturong and Dingo run on the same host with a hourly cronjob to
 trigger the builds.  These failures are caused by concurrent test runs
 on different branches which use the same tmp_check directory for
 pg_upgrade tests, see
 http://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=dingodt=2015-07-07%2002%3A58%3A01stg=check-pg_upgrade

Ouch.

 It looks like neither make (GNU Make 4.0) nor the shell (the default
 Solaris /bin/sh) updates $PWD to point to the current directory where
 test.sh is executed, and test.sh puts the test cluster in the original
 working directory of the process that launched make.

Double ouch.  It's the responsibility of the shell, not gmake, that PWD
reflect reality.  POSIX 2008, under Shell Variables, quoth as follows:

PWD
  Set by the shell and by the cd utility. In the shell the value shall be
  initialized from the environment as follows. If a value for PWD is
  passed to the shell in the environment when it is executed, the value is
  an absolute pathname of the current working directory that is no longer
  than {PATH_MAX} bytes including the terminating null byte, and the value
  does not contain any components that are dot or dot-dot, then the shell
  shall set PWD to the value from the environment. Otherwise, if a value
  for PWD is passed to the shell in the environment when it is executed,
  the value is an absolute pathname of the current working directory, and
  the value does not contain any components that are dot or dot-dot, then
  it is unspecified whether the shell sets PWD to the value from the
  environment or sets PWD to the pathname that would be output by pwd
  -P. Otherwise, the sh utility sets PWD to the pathname that would be
  output by pwd -P. In cases where PWD is set to the value from the
  environment, the value can contain components that refer to files of
  type symbolic link. In cases where PWD is set to the pathname that would
  be output by pwd -P, if there is insufficient permission on the current
  working directory, or on any parent of that directory, to determine what
  that pathname would be, the value of PWD is unspecified. Assignments to
  this variable may be ignored. If an application sets or unsets the value
  of PWD, the behaviors of the cd and pwd utilities are unspecified.

On the other hand, there is no text at all about PWD in the predecessor
Single Unix Spec v2, which is what we frequently regard as our minimum
baseline.  So one could argue that the Solaris shell you're using is
a valid implementation of SUS v2.

 I've restricted builds to one at a time on that host to work around this
 issue for now.  Also attached a patch to explicitly set PWD=$(CURDIR) in
 the Makefile to make sure test.sh runs with the right directory.

Given the last sentence in the POSIX 2008 text, I think unconditionally
munging PWD as you're proposing is a bit risky.  What I suggest is that
we add code to set PWD only if it's not set, which is most easily done
in test.sh itself, along the lines of

# Very old shells may not set PWD for us.
  if [ x"$PWD" = x ]; then
  PWD=`pwd -P`
fi

A quick look around says that pg_upgrade/test.sh is the only place where
we're depending on shell PWD, so we only need to fix this one script.

regards, tom lane




[HACKERS] [PATCH] correct the initdb.log path for pg_regress

2015-07-07 Thread David Christensen
When encountering an initdb failure in pg_regress, we were displaying the 
incorrect path to the log file; this commit fixes all 3 places this could occur.



0001-Output-the-correct-path-for-initdb.log-in-pg_regress.patch
Description: Binary data



--
David Christensen
PostgreSQL Team Manager
End Point Corporation
da...@endpoint.com
785-727-1171








Re: [HACKERS] Repeated pg_upgrade buildfarm failures on binturon

2015-07-07 Thread Tom Lane
I wrote:
 Given the last sentence in the POSIX 2008 text, I think unconditionally
 munging PWD as you're proposing is a bit risky.  What I suggest is that
 we add code to set PWD only if it's not set, which is most easily done
 in test.sh itself, along the lines of

   # Very old shells may not set PWD for us.
  if [ x"$PWD" = x ]; then
 PWD=`pwd -P`
   fi

Oh, wait, scratch that: the build logs you showed clearly indicate that
the test is running with temp_root set to
/export/home/pgfarmer/build-farm/tmp_check
which implies that PWD was not empty but /export/home/pgfarmer/build-farm.
So the above wouldn't fix it.

A likely hypothesis is that the buildfarm script was invoked using some
modern shell that did set PWD, but then test.sh is being executed (in a
much lower directory) by some SUSv2-era shell that doesn't.

I'm still kind of afraid to explicitly change PWD in a modern shell,
though.  Perhaps the right thing is just not to rely on PWD at all
in test.sh, but replace $PWD with `pwd -P`.  (I did check that this
utility is required by SUSv2.)

regards, tom lane




Re: [HACKERS] Freeze avoidance of very large table.

2015-07-07 Thread Simon Riggs
On 7 July 2015 at 15:18, Sawada Masahiko sawada.m...@gmail.com wrote:


  I would also like to see the visibilitymap_test function exposed in
  SQL, so we can write code to examine the map contents for particular
  ctids. By doing that we can then write a formal test that shows the
  evolution of tuples through insertion, vacuuming and freezing, testing
  that the map has been set correctly at each stage.
  I guess that needs to be done as an isolation test so we have an
  observer that constrains the xmin in various ways.
  In light of the multixact bugs, any code that changes the on-disk
  tuple metadata needs formal tests.

 Attached is a patch that adds a few functions to contrib/pg_freespacemap
 to explore the inside of the visibility map, which I used for my test.
 I hope it helps for testing this feature.


I don't think pg_freespacemap is the right place.

I'd prefer to add that as a single function into core, so we can write
formal tests. I would not personally commit this feature without rigorous
and easily repeatable verification.
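Such a core function might look roughly like this (hypothetical: the
SQL-callable wrapper and its name are invented, though visibilitymap_test()
does have this C signature today):

PG_FUNCTION_INFO_V1(pg_visibilitymap_test);

Datum
pg_visibilitymap_test(PG_FUNCTION_ARGS)
{
	Oid			relid = PG_GETARG_OID(0);
	int64		blkno = PG_GETARG_INT64(1);
	Relation	rel = relation_open(relid, AccessShareLock);
	Buffer		vmbuf = InvalidBuffer;
	bool		all_visible;

	all_visible = visibilitymap_test(rel, (BlockNumber) blkno, &vmbuf);

	if (BufferIsValid(vmbuf))
		ReleaseBuffer(vmbuf);
	relation_close(rel, AccessShareLock);

	PG_RETURN_BOOL(all_visible);
}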

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] New functions

2015-07-07 Thread Michael Paquier
On Mon, Mar 23, 2015 at 12:46 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Mon, Mar 23, 2015 at 1:48 AM, Воронин Дмитрий
 carriingfat...@yandex.ru wrote:

  Please, attach new version of my patch to commitfest page.

 Michael, I found a way to attach the patch. Sorry for the trouble.

 Cool. Thanks. I am seeing your patch entry here:
 https://commitfest.postgresql.org/5/192/
 I'll try to take a look at it for the next commit fest, but please do
 not expect immediate feedback; things are getting wrapped up for 9.5.

OK, so I have looked at this patch in more detail. Here are some comments:
1) As this is an upgrade to sslinfo 1.1, sslinfo--1.0.sql is not necessary.
2) contrib/sslinfo/Makefile needs to be updated with
sslinfo--1.0--1.1.sql and sslinfo--1.1.sql.
3) This return type is not necessary:
+ CREATE TYPE extension AS (
+ name text,
+ value text
+ );
+
+ CREATE OR REPLACE FUNCTION ssl_extension_names() RETURNS SETOF extension
+ AS 'MODULE_PATHNAME', 'ssl_extension_names'
+ LANGUAGE C STRICT;
Instead, I think that we should make ssl_extension_names return a
SETOF record with some OUT parameters. Also, using a tuple descriptor
saved in the user context would bring more readability.
4) sslinfo.control needs to be updated to 1.1.
5) I think that it is better to return an empty result should no
client certificate be present or should ssl be disabled for a given
connection. And the patch failed to do that with SRF_RETURN_DONE.
6) The code is critically lacking comments, and contains many typos.
7) The return status of get_extension() is not necessary. All the code
paths calling it simply ERROR should the status be false. It is better
to move the error message directly in the function and remove the
status code.
8) As proposed, the patch adds 3 new functions:
ssl_extension_is_critical, ssl_extension_value and
ssl_extension_names. But actually I am wondering why
ssl_extension_is_critical and ssl_extension_value are actually useful.
I mean, what counts is the extension information about the existing
client certificate, no? Hence I think that we should remove
ssl_extension_is_critical and ssl_extension_value, and extend
ssl_extension_names with a new boolean flag is_critical to determine
if a given extension is critical or not. Let's rename
ssl_extension_names to ssl_extension_info at the same time.
get_extension is not needed anymore with that as well.

Speaking of which, I have rewritten the patch as attached. This looks
way cleaner than the previous version submitted. Dmitry, does that
look fine for you?
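
For reference, with the rewritten interface the function is used like
this (output columns as defined in the attached patch):

   -- one row per extension present in the client certificate
   SELECT name, value, iscritical FROM ssl_extension_info();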
I am switching this patch as Waiting on Author.
Regards,
-- 
Michael
From 6ab5ec9ea6705eaf196b07e76b04a78fa0a0f220 Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Tue, 7 Jul 2015 23:01:24 +0900
Subject: [PATCH] Add ssl_extension_info in sslinfo

This new function provides information about extension in client
certificates: extension name, extension value and critical status.
---
 contrib/sslinfo/Makefile   |   3 +-
 contrib/sslinfo/sslinfo--1.0--1.1.sql  |  13 ++
 .../sslinfo/{sslinfo--1.0.sql => sslinfo--1.1.sql} |  12 +-
 contrib/sslinfo/sslinfo.c  | 167 -
 contrib/sslinfo/sslinfo.control|   2 +-
 doc/src/sgml/sslinfo.sgml  |  19 +++
 6 files changed, 210 insertions(+), 6 deletions(-)
 create mode 100644 contrib/sslinfo/sslinfo--1.0--1.1.sql
 rename contrib/sslinfo/{sslinfo--1.0.sql => sslinfo--1.1.sql} (81%)

diff --git a/contrib/sslinfo/Makefile b/contrib/sslinfo/Makefile
index 86cbf05..f6c1472 100644
--- a/contrib/sslinfo/Makefile
+++ b/contrib/sslinfo/Makefile
@@ -4,7 +4,8 @@ MODULE_big = sslinfo
 OBJS = sslinfo.o $(WIN32RES)
 
 EXTENSION = sslinfo
-DATA = sslinfo--1.0.sql sslinfo--unpackaged--1.0.sql
+DATA = sslinfo--1.0--1.1.sql sslinfo--1.1.sql \
+	sslinfo--unpackaged--1.0.sql
 PGFILEDESC = sslinfo - information about client SSL certificate
 
 ifdef USE_PGXS
diff --git a/contrib/sslinfo/sslinfo--1.0--1.1.sql b/contrib/sslinfo/sslinfo--1.0--1.1.sql
new file mode 100644
index 000..c98a3ae
--- /dev/null
+++ b/contrib/sslinfo/sslinfo--1.0--1.1.sql
@@ -0,0 +1,13 @@
+/* contrib/sslinfo/sslinfo--1.0--1.1.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION sslinfo UPDATE TO '1.1'" to load this file. \quit
+
+CREATE OR REPLACE FUNCTION ssl_extension_info(
+OUT name text,
+OUT value text,
+OUT iscritical boolean
+)
+RETURNS SETOF record
+AS 'MODULE_PATHNAME', 'ssl_extension_info'
+LANGUAGE C STRICT;
diff --git a/contrib/sslinfo/sslinfo--1.0.sql b/contrib/sslinfo/sslinfo--1.1.sql
similarity index 81%
rename from contrib/sslinfo/sslinfo--1.0.sql
rename to contrib/sslinfo/sslinfo--1.1.sql
index 79ce656..d63ddd5 100644
--- a/contrib/sslinfo/sslinfo--1.0.sql
+++ b/contrib/sslinfo/sslinfo--1.1.sql
@@ -1,4 +1,4 @@
-/* contrib/sslinfo/sslinfo--1.0.sql */
+/* 

[HACKERS] Missing latex-longtable value

2015-07-07 Thread Bruce Momjian
I have discovered that psql \pset format does not display
latex-longtable as a valid value, e.g.:

test=> \pset format kjasdf
\pset: allowed formats are unaligned, aligned, wrapped, html, asciidoc, 
latex, troff-ms

With the attached patch, the latex-longtable value is properly displayed:

test=> \pset format kjasdf
\pset: allowed formats are unaligned, aligned, wrapped, html, asciidoc, 
latex, latex-longtable, troff-ms

Should this be fixed in 9.6 only or 9.5 too?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
new file mode 100644
index 2728216..5eb1e88
*** a/src/bin/psql/command.c
--- b/src/bin/psql/command.c
*** do_pset(const char *param, const char *v
*** 2484,2490 
  			popt->topt.format = PRINT_TROFF_MS;
  		else
  		{
! 			psql_error("\\pset: allowed formats are unaligned, aligned, wrapped, html, asciidoc, latex, troff-ms\n");
  			return false;
  		}
  
--- 2484,2490 
  			popt->topt.format = PRINT_TROFF_MS;
  		else
  		{
! 			psql_error("\\pset: allowed formats are unaligned, aligned, wrapped, html, asciidoc, latex, latex-longtable, troff-ms\n");
  			return false;
  		}
  



Re: [HACKERS] FPW compression leaks information

2015-07-07 Thread Stephen Frost
* Heikki Linnakangas (hlinn...@iki.fi) wrote:
 On 07/07/2015 04:31 PM, Stephen Frost wrote:
 The alternative is to have monitoring tools which are running as
 superuser, which, in my view at least, is far worse.
 
 Or don't enable fpw_compression for tables where the information
 leak is a problem.

My hope would be that we would enable FPW compression by default for
everyone as a nice optimization.  Relegating it to a risky option which
has to be tweaked on a per-table basis, but only for those tables where
you don't mind the risk, makes a nice optimization nearly unusable for
many environments.

 I'm not following.  If we don't want the information to be available to
 everyone then we need to limit it, and right now the only way to do that
 is to restrict it to superuser because we haven't got anything more
 granular.
 
 In other words, it seems like your above response about not wanting this
 to be available to anyone except superusers is counter to this last
 sentence where you seem to be saying we should continue to have the
 information available to everyone and simply document that there's a
 risk there as in the proposed patch.
 
 I don't think we can or should try to hide the current WAL location.
 At least not until we have a monitoring role separate from
 superuserness.

I'd rather we hide it now, to allow FPW compression to be enabled for
everyone, except those few environments where it ends up making things
worse, and then provide the monitoring role in 9.6 which addresses this
and various other pieces that are currently superuser-only.  We could
wait, but I think we'd end up discouraging people from using the option
because of the caveat and we'd then spend years undoing that in the
future.

Thanks!

Stephen




Re: [HACKERS] Improving regression tests to check LOCK TABLE and table permissions

2015-07-07 Thread Michael Paquier
On Wed, Jul 8, 2015 at 6:39 AM, Joe Conway wrote:
 Committed and pushed to master and 9.5

Thanks, Joe!
-- 
Michael




Re: [HACKERS] Information of pg_stat_ssl visible to all users

2015-07-07 Thread Michael Paquier
On Wed, Jul 8, 2015 at 3:29 AM, Stephen Frost sfr...@snowman.net wrote:
 * Josh Berkus (j...@agliodbs.com) wrote:
 On 07/07/2015 09:06 AM, Magnus Hagander wrote:
 
  To make it accessible to monitoring systems that don't run as superuser
  (which should be most monitoring systems, but we have other cases making
  that hard as has already been mentioned upthread).
 
  I'm having a hard time trying to figure out a consensus in this thread.
  I think there are slightly more arguments for limiting the access though.
 
  The question then is, if we want to hide everything, do we care about
  doing the NULL dance, or should we just throw an error for
  non-superusers trying to access it?

 I'm going to vote against blocking the entire view for non-superusers.
 One of the things people will want to monitor is are all connection
 from subnet X using SSL? which is most easily answered by joining
 pg_stat_activity and pg_stat_ssl.

 If we force users to use superuser privs to find this out, then we're
 encouraging them to run monitoring as superuser, which is something we
 want to get *away* from.

 I agree with all of this, but I'm worried that if we make it available
 now then we may not be able to hide it later, even once we have the
 monitoring role defined, because of backwards compatibility concerns.

 If we aren't worried about that, then perhaps we can leave it less
 strict for 9.5 and then make it stricter for 9.6.

Agreed. It is better to make things strict first and relax afterwards
from the user perspective, so I would vote for something in this
direction. We could relax it with this monitoring ACL afterwards in
9.6; still, what I think we are missing here are reactions from the
field, and I suspect that taking the most careful approach (hiding a
maximum of fields from non-authorized users) will pay better in the long
term. I also suspect that if we leave it as-is, cloud deployments
of Postgres (Heroku for example) are going to patch this view with ACL
checks.
-- 
Michael




Re: [HACKERS] Performance improvement for joins where outer side is unique

2015-07-07 Thread David Rowley
On 8 July 2015 at 02:00, Alexander Korotkov a.korot...@postgrespro.ru
wrote:


 Patch doesn't apply to current master. Could you, please, rebase it?


Attached. Thanks.

Regards

David Rowley

--
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


unique_joins_9e6d4e4_2015-07-08.patch
Description: Binary data



Re: [HACKERS] bugfix: incomplete implementation of errhidecontext

2015-07-07 Thread Andres Freund
On 2015-07-03 06:20:14 +0200, Pavel Stehule wrote:
 I would like to use it for controlling (enabling, disabling) CONTEXT in RAISE
 statement in plpgsql. I think one option for this purpose is
 enough, and I would not add another option to specify LOG, CLIENT.

I don't think a plpgsql function should be able to suppress all
context. From a security/debuggability POV that's a bad idea. The
context messages are the only way right now to have any chance of
tracing back what caused an error in a function because log_statements et
al. will not show it.




Re: [HACKERS] FPW compression leaks information

2015-07-07 Thread Fujii Masao
On Thu, Apr 16, 2015 at 4:26 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Wed, Apr 15, 2015 at 9:42 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 On Wed, Apr 15, 2015 at 9:20 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 On Wed, Apr 15, 2015 at 2:22 PM, Fujii Masao wrote:
 On Wed, Apr 15, 2015 at 11:55 AM, Michael Paquier wrote:
 1) Doc patch to mention that it is possible that compression can give
 hints to attackers when working on sensitive fields that have a
 non-fixed size.

 I think that this patch is enough as the first step.

 I'll get something done for that at least, a big warning below the
 description of wal_compression would do it.

 So here is a patch for this purpose, with the following text being used:

Thanks for the patch!

 +   <warning>
 +<para>
 + When enabling <varname>wal_compression</varname>, there is a risk
 + to leak data similarly to the BREACH and CRIME attacks on SSL where

Isn't it better to add the link to the corresponding wikipedia page for
the terms BREACH and CRIME?


 + the compression ratio of a full page image gives a hint of what is
 + the existing data of this page.  Tables that contain sensitive
 + information like <structname>pg_authid</structname> with password
 + data could be potential targets to such attacks. Note that as a
 + prerequisite a user needs to be able to insert data on the same page
 + as the data targeted and need to be able to detect checkpoint
 + presence to find out if a compressed full page write is included in
 + WAL to calculate the compression ratio of a page using WAL positions
 + before and after inserting data on the page with data targeted.
 +</para>
 +   </warning>

I think that, in addition to the description of the risk, it's better to
say that this vulnerability is cumbersome to exploit in practice.

Attached is the updated version of the patch. Comments?

Regards,

-- 
Fujii Masao
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***
*** 2311,2316  include_dir 'conf.d'
--- 2311,2339 
  but at the cost of some extra CPU spent on the compression during
  WAL logging and on the decompression during WAL replay.
 /para
+ 
+    <warning>
+     <para>
+      When enabling <varname>wal_compression</varname>, there is a risk
+      to leak data similarly to the
+      <ulink url="http://en.wikipedia.org/wiki/BREACH_%28security_exploit%29">BREACH</ulink>
+      and <ulink url="http://en.wikipedia.org/wiki/CRIME">CRIME</ulink>
+      attacks on SSL where the compression ratio of a full
+      page image gives a hint of what is the existing data of this page.
+      Tables that contain sensitive information like
+      <structname>pg_authid</structname> with password data could be
+      potential targets to such attacks. Such attacks are less likely to
+      occur in practice because an attacker has to be able to insert data
+      on the same page as the data targeted, to detect occurrences of
+      checkpoint, and to calculate the compression ratio of a full page
+      image by accessing to information on WAL position. Also an attacker
+      has to repeat many times the attempts to obtain the hint of data
+      from the compression ratio, which would take a long time.
+      Therefore this vulnerability is practically quite cumbersome to
+      exploit, but it's doable. If this risk of data leakage should be
+      avoided, <varname>wal_compression</varname> needs to be disabled.
+     </para>
+    </warning>
    </listitem>
   </varlistentry>
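
To make the prerequisite concrete, the measurement described in the
warning can be sketched as follows (table and values hypothetical;
assumes the attacker can execute SQL on the same cluster):

   SELECT pg_current_xlog_location();             -- note position A
   INSERT INTO t VALUES ('attacker-chosen guess');
   SELECT pg_current_xlog_location();             -- note position B
   -- WAL volume between A and B varies with how well the guess
   -- compresses against the page's existing data:
   SELECT pg_xlog_location_diff(:'B', :'A');      -- psql variables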
  



Re: [HACKERS] FPW compression leaks information

2015-07-07 Thread Fujii Masao
On Sat, May 30, 2015 at 4:58 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Thu, Apr 16, 2015 at 4:26 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 On Wed, Apr 15, 2015 at 9:42 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 On Wed, Apr 15, 2015 at 9:20 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 On Wed, Apr 15, 2015 at 2:22 PM, Fujii Masao wrote:
 On Wed, Apr 15, 2015 at 11:55 AM, Michael Paquier wrote:
 1) Doc patch to mention that it is possible that compression can give
 hints to attackers when working on sensitive fields that have a
 non-fixed size.

 I think that this patch is enough as the first step.

 I'll get something done for that at least, a big warning below the
 description of wal_compression would do it.

 So here is a patch for this purpose, with the following text being used:
 +   <warning>
 +<para>
 + When enabling <varname>wal_compression</varname>, there is a risk
 + to leak data similarly to the BREACH and CRIME attacks on SSL where
 + the compression ratio of a full page image gives a hint of what is
 + the existing data of this page.  Tables that contain sensitive
 + information like <structname>pg_authid</structname> with password
 + data could be potential targets to such attacks. Note that as a
 + prerequisite a user needs to be able to insert data on the same page
 + as the data targeted and need to be able to detect checkpoint
 + presence to find out if a compressed full page write is included in
 + WAL to calculate the compression ratio of a page using WAL positions
 + before and after inserting data on the page with data targeted.
 +</para>
 +   </warning>

 Comments and reformulations are welcome.

 To make things on this thread move on, I just wanted to add that we
 should make wal_compression SUSET

I'm OK to make it SUSET.

Regards,

-- 
Fujii Masao




Re: [HACKERS] Freeze avoidance of very large table.

2015-07-07 Thread Amit Kapila
On Thu, Jul 2, 2015 at 9:00 PM, Sawada Masahiko sawada.m...@gmail.com
wrote:


 Thank you for bug report, and comments.

 Fixed version is attached, and source code comment is also updated.
 Please review it.


I am looking into this patch and would like to share my findings with
you:

1.
@@ -2131,8 +2133,9 @@ heap_insert(Relation relation, HeapTuple tup,
CommandId cid,

CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);

  /*
- * Find buffer to insert this tuple into.  If the page is all visible,
- * this will also pin the requisite visibility map page.
+ * Find buffer to insert this tuple into.  If the page is all visible
+ * of all frozen, this will also pin the requisite visibility map and
+ * frozen map page.
  */

typo in comments.

/of all frozen/or all frozen

2.
visibilitymap.c
+ * The visibility map is a bitmap with two bits (all-visible and all-frozen
+ * per heap page.

/and all-frozen/and all-frozen)
closing round bracket is missing.

3.
visibilitymap.c
-/*#define TRACE_VISIBILITYMAP */
+#define TRACE_VISIBILITYMAP

why is this hash define opened?

4.
-visibilitymap_count(Relation rel)
+visibilitymap_count(Relation rel, bool for_visible)

This API needs to count set bits for either visibility info, frozen info,
or both (if required); it seems better to have the second parameter as
uint8 flags rather than bool. Also, if it is required to be called at most
places for both visibility and frozen bits count, why not get them
in one call?
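
Something along these lines, perhaps (a sketch only; names and flag
values are hypothetical):

   /* flag bits selecting which counts the caller wants */
   #define VM_COUNT_ALL_VISIBLE 0x01
   #define VM_COUNT_ALL_FROZEN  0x02

   /* fills only the requested counters, in a single pass over the map */
   extern void visibilitymap_count(Relation rel, uint8 flags,
                                   BlockNumber *all_visible,
                                   BlockNumber *all_frozen);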

5.
Clearing the visibility and frozen bits separately for the DML
operations would lead to locking/unlocking the corresponding buffer
twice; can we do it as one operation?  I think this is suggested
by Simon as well.

6.
- * Before locking the buffer, pin the visibility map page if it appears to
- * be necessary.  Since we haven't got the lock yet, someone else might be
+ * Before locking the buffer, pin the visibility map if it appears to be
+ * necessary.  Since we haven't got the lock yet, someone else might be

Why you have deleted 'page' in above comment?

7.
@@ -3490,21 +3532,23 @@ l2:
  	UnlockTupleTuplock(relation, &(oldtup.t_self), *lockmode);

  	if (vmbuffer != InvalidBuffer)
  		ReleaseBuffer(vmbuffer);
+
  	bms_free(hot_attrs);

Seems unnecessary change.

8.
@@ -1919,11 +1919,18 @@ index_update_stats(Relation rel,
  {
  	BlockNumber relpages = RelationGetNumberOfBlocks(rel);
  	BlockNumber relallvisible;
+ 	BlockNumber relallfrozen;

  	if (rd_rel->relkind != RELKIND_INDEX)
- 		relallvisible = visibilitymap_count(rel);
+ 	{
+ 		relallvisible = visibilitymap_count(rel, true);
+ 		relallfrozen = visibilitymap_count(rel, false);
+ 	}
  	else	/* don't bother for indexes */
+ 	{
  		relallvisible = 0;
+ 		relallfrozen = 0;
+ 	}

I think in this function, you have forgotten to update the
relallfrozen value in pg_class.

9.
vacuumlazy.c

@@ -253,14 +258,16 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
  * NB: We need to check this before truncating the relation, because that
  * will change ->rel_pages.
  */
- if (vacrelstats->scanned_pages < vacrelstats->rel_pages)
+ if ((vacrelstats->scanned_pages + vacrelstats->vmskipped_pages)
+     < vacrelstats->rel_pages)
  {
- 	Assert(!scan_all);

Why have you removed this Assert?  Won't the count of
vacrelstats->scanned_pages + vacrelstats->vmskipped_pages be
equal to vacrelstats->rel_pages when scan_all = true?

10.
vacuumlazy.c
lazy_vacuum_rel()
..
+ scanned_all |= scan_all;
+

Why is this new assignment added? Please add a comment to
explain it.

11.
lazy_scan_heap()
..
+ * Also, skipping even a single page accorind to all-visible bit of
+ * visibility map means that we can't update relfrozenxid, so we only want
+ * to do it if we can skip a goodly number. On the other hand, we count
+ * both how many pages we skipped according to all-frozen bit of visibility
+ * map and how many pages we freeze page, so we can update relfrozenxid if
+ * the sum of their is as many as tuples per page.

a.
typo
/accorind/according
b.
Is the second part of the comment (starting from "On the other hand")
right?  I mean you are comparing the sum of pages skipped due to the
all-frozen bit and the number of pages frozen with tuples per page.
I don't understand how they are related.


12.
@@ -918,8 +954,13 @@ lazy_scan_heap(Relation onerel, LVRelStats
*vacrelstats,
  else
  {
  num_tuples += 1;
+ ntup_in_blk += 1;
  hastup = true;

+ /* If current tuple is already frozen, count it up */
+ if (HeapTupleHeaderXminFrozen(tuple.t_data))
+ already_nfrozen += 1;
+
  /*
  * Each non-removable tuple must be checked to see if it needs
  * freezing.  Note we already have exclusive buffer lock.

Here, if the tuple is already frozen, can't we just continue and
check the next tuple?
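
That is, something like this (sketch):

   /* already frozen: count it and move on to the next tuple */
   if (HeapTupleHeaderXminFrozen(tuple.t_data))
   {
       already_nfrozen += 1;
       continue;
   }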

13.
+extern XLogRecPtr log_heap_frozenmap(RelFileNode rnode, Buffer heap_buffer,
+ Buffer fm_buffer);

It seems like this function is not used.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] replication slot restart_lsn initialization

2015-07-07 Thread Andres Freund
On 2015-06-10 13:13:41 -0700, Gurjeet Singh wrote:
  /*
 + * Grab and save an LSN value to prevent WAL recycling past that point.
 + */
 +void
 +ReplicationSlotRegisterRestartLSN()
 +{
 + ReplicationSlot *slot = MyReplicationSlot;
 +
 + Assert(slot != NULL);
 + Assert(slot->data.restart_lsn == InvalidXLogRecPtr);
 +
 + /*
 +  * The replication slot mechanism is used to prevent removal of required
 +  * WAL. As there is no interlock between this and checkpoints required, WAL
 +  * segment could be removed before ReplicationSlotsComputeRequiredLSN() has
 +  * been called to prevent that. In the very unlikely case that this happens
 +  * we'll just retry.
 +  */
 + while (true)
 + {
 + XLogSegNo   segno;
 +
 + /*
 +  * Let's start with enough information if we can, so log a standby
 +  * snapshot and start decoding at exactly that position.
 +  */
 + if (!RecoveryInProgress())
 + {
 + XLogRecPtr  flushptr;
 +
 + /* start at current insert position */
 + slot->data.restart_lsn = GetXLogInsertRecPtr();
 +
 + /*
 +  * Log an xid snapshot for logical replication. It's not needed for
 +  * physical slots as it is done in BGWriter on a regular basis.
 +  */
 + if (!slot->data.persistency == RS_PERSISTENT)
 + {
 + 	/* make sure we have enough information to start */
 + 	flushptr = LogStandbySnapshot();
 +
 + 	/* and make sure it's fsynced to disk */
 + 	XLogFlush(flushptr);
 + }

Huh? The slot->data.persistency == RS_PERSISTENT check seems pretty much
entirely random to me. I mean physical slots can (and normally are)
persistent as well?  Instead check whether it's a database-specific slot.

The reasoning why it's not helpful for physical slots is wrong. The
point is that a standby snapshot at that location will simply not help
physical replication, it's only going to read ones after the location at
which the base backup starts (i.e. the location from the backup label).
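
A sketch of the check I have in mind (SlotIsLogical() tests
slot->data.database, i.e. whether the slot is database-specific):

   /*
    * Only logical slots need the standby snapshot; physical replication
    * reads WAL from the base backup's start location anyway.
    */
   if (SlotIsLogical(slot))
   {
       flushptr = LogStandbySnapshot();
       XLogFlush(flushptr);
   }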

  /* 
 - * Manipulation of ondisk state of replication slots
 + * Manipulation of on-disk state of replication slots
   *
   * NB: none of the routines below should take any notice whether a slot is 
 the
   * current one or not, that's all handled a layer above.
 diff --git a/src/backend/replication/slotfuncs.c 
 b/src/backend/replication/slotfuncs.c
 index 9a2793f..bd526fa 100644
 --- a/src/backend/replication/slotfuncs.c
 +++ b/src/backend/replication/slotfuncs.c
 @@ -40,6 +40,7 @@ Datum
  pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
  {
  	Name		name = PG_GETARG_NAME(0);
 + 	bool		activate = PG_GETARG_BOOL(1);

Don't like 'activate' much as a name. How about 'immediately_reserve'?

Other than that it's looking good to me.

Regards,

Andres




Re: [HACKERS] Comfortably check BackendPID with psql

2015-07-07 Thread Julien Rouhaud
Le 07/07/2015 13:41, Andres Freund a écrit :
 On 2015-07-05 14:11:38 +0200, Julien Rouhaud wrote:
 Tiny for me too, but I sometimes had the need.

 I can't really see any good reason not to add a %p escape to psql's
 PROMPT, so I'm attaching a simple patch to implement it. Unless someone
 objects, I'll add it to the next commitfest.
 
 Pushed the patch. I only made a minor belt-and-suspenders type of
 change, namely to check whether PQbackendPID() returns 0 and not print
 that and replaced PID by pid in the docs and comments.
 
 Thanks for the patch!
 

Thanks!

 Greetings,
 
 Andres Freund
 


-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org




Re: [HACKERS] Memory Accounting v11

2015-07-07 Thread Andres Freund
On 2015-07-07 16:49:58 +1200, David Rowley wrote:
 I've been looking at this patch and trying to reproduce the reported
 slowdown by using Tomas' function to try to exercise palloc() with minimal
 overhead of other code:
 
 https://github.com/tvondra/palloc_bench

That's not necessarily meaningful. Increased cache footprint (both data
and branch prediction) is often only noticeable with additional code
running that pushes stuff out.





Re: [HACKERS] RFC: replace pg_stat_activity.waiting with something more descriptive

2015-07-07 Thread Kyotaro HORIGUCHI
Hello,

At Tue, 7 Jul 2015 16:27:38 +0900, Fujii Masao masao.fu...@gmail.com wrote in 
CAHGQGwEJwov8YwvmbbWps3Rba6kF1yf7qL3S==Oy4D=gq9y...@mail.gmail.com
 Each backend reports its event when trying to take a lock. But
 the reported event is never reset until next event is reported.
 Is this OK? This means that the wait_event column keeps showing
 the *last* event while a backend is in idle state, for example.
 So, shouldn't we reset the reported event or report another one
 when releasing the lock?

It seems so, but pg_stat_activity.waiting would indicate whether
the event is lasting. However, .waiting reflects only the status
of heavy-weight locks. It would be quite misleading.

I think that pg_stat_activity.wait_event should be linked to
.waiting; then .wait_event should be restricted to heavy-weight
locks if the meaning of .waiting cannot be changed.

On the other hand, we need to have as many wait events as
Itagaki-san's patch did so pg_stat_activity might be the wrong
place for full-spec wait_event.

 +read_string_from_waitevent(uint8 wait_event)
 
 The performance of this function looks poor because its worst case
 is O(n): n is the number of all the events that we are trying to track.
 Also in pg_stat_activity, this function is called per backend.
 Can't we retrieve the event name by using wait event ID as an index
 of WaitEventTypeMap array?

+1
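
That is, something like this (a sketch; the array and field names are
whatever the patch defines, assuming the array is indexed by, and dense
in, the event ID):

   static inline const char *
   read_string_from_waitevent(uint8 wait_event)
   {
       Assert(wait_event < lengthof(WaitEventTypeMap));
       return WaitEventTypeMap[wait_event].name;
   }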

regards,


At Tue, 7 Jul 2015 16:27:38 +0900, Fujii Masao masao.fu...@gmail.com wrote in 
CAHGQGwEJwov8YwvmbbWps3Rba6kF1yf7qL3S==Oy4D=gq9y...@mail.gmail.com
 On Tue, Jun 30, 2015 at 10:30 PM, Amit Kapila amit.kapil...@gmail.com wrote:
  On Fri, Jun 26, 2015 at 6:26 PM, Robert Haas robertmh...@gmail.com wrote:
 
  On Thu, Jun 25, 2015 at 11:57 PM, Amit Kapila amit.kapil...@gmail.com
  wrote:
3. Add new view 'pg_stat_wait_event' with following info:
pid   - process id of this backend
waiting - true for any form of wait, false otherwise
wait_event_type - Heavy Weight Lock, Light Weight Lock, I/O wait,
etc
wait_event - Lock (Relation), Lock (Relation Extension), etc
   I am pretty unconvinced that it's a good idea to try to split up the
   wait event into two columns.  I'm only proposing ~20 wait states, so
   there's something like 5 bits of information here.  Spreading that
   over two text columns is a lot, and note that Amit's text would
   basically recapitulate the contents of the first column in the second
   one, which I cannot get excited about.
   There is an advantage in splitting the columns which is if
   wait_event_type
   column indicates Heavy Weight Lock, then user can go and check further
   details in pg_locks, I think he can do that even by seeing wait_event
   column, but that might not be as straightforward as with wait_event_type
   column.
 
  It's just a matter of writing event_type LIKE 'Lock %' instead of
  event_type = 'Lock'.
 
 
  Yes that way it can be done and may be that is not inconvenient for user,
  but then there is other type of information which user might need like what
  distinct resources on which wait is possible, which again he can easily find
  with
  different event_type column. I think some more discussion is required before
  we
  could freeze the user interface for this feature, but in the meantime I have
  prepared an initial patch by adding a new column wait_event in
  pg_stat_activity.
  For now, I have added the support for Heavy-Weight locks, Light-Weight locks
  [1]
  and Buffer Cleanup Lock.  I could add for other types (spin lock delay
  sleep, IO,
  network IO, etc.) if there is no objection in the approach used in patch to
  implement
  this feature.
 
  [1] For LWLocks, currently I have used wait_event as OtherLock for locks
  other than NUM_FIXED_LWLOCKS (Refer function NumLWLocks to see all
  type of LWLocks).  The reason is that there is no straight forward way to
  get
  the id (lockid) of such locks as for some of those (like shared_buffers,
  MaxBackends) the number of locks will depend on run-time configuration
  parameters.  I think if we want to handle those then we could either do some
  math to find out the lockid based on runtime values of these parameters or
  we could add tag in LWLock structure (which indicates the lock type) and
  fill it during Lock initialization or may be some other better way to do it.
 
  I have still not added documentation and have not changed anything for
  waiting column in pg_stat_activity as I think before that we need to
  finalize
  the user interface.  Apart from that as mentioned above still wait for
  some event types (like IO, network IO, etc.) is not added and also I think
  separate function/'s (like we have for existing ones
  pg_stat_get_backend_waiting)
  will be required which again depends upon user interface.
 
 Yes, we need to discuss what events to track. As I suggested upthread,
 Itagaki-san's patch would be helpful to think that. He proposed to track
 not only wait event like locking and I/O operation but also CPU 

Re: [HACKERS] Comfortably check BackendPID with psql

2015-07-07 Thread Andres Freund
On 2015-07-05 14:11:38 +0200, Julien Rouhaud wrote:
 Tiny for me too, but I sometimes had the need.
 
 I can't really see any good reason not to add a %p escape to psql's
 PROMPT, so I'm attaching a simple patch to implement it. Unless someone
 objects, I'll add it to the next commitfest.

Pushed the patch. I only made a minor belt-and-suspenders type of
change, namely to check whether PQbackendPID() returns 0 and not print
that and replaced PID by pid in the docs and comments.

Thanks for the patch!

Greetings,

Andres Freund




Re: [HACKERS] Freeze avoidance of very large table.

2015-07-07 Thread Simon Riggs
On 6 July 2015 at 17:28, Simon Riggs si...@2ndquadrant.com wrote:

I think we need something for pg_upgrade to rewrite existing VMs. Otherwise
 a large read only database would suddenly require a massive revacuum after
 upgrade, which seems bad. That can wait for now until we all agree this
 patch is sound.


Since we need to rewrite the vm map, I think we should call the new map
"vfm".

That way we will be able to easily check whether the rewrite has been
conducted on all relations.

Since the maps are just bits, there is no other way to tell that a map has
been rewritten.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] patch : Allow toast tables to be moved to a different tablespace

2015-07-07 Thread Andres Freund
On 2015-07-03 18:03:58 -0400, Tom Lane wrote:
 I have just looked through this thread, and TBH I think we should reject
 this patch altogether --- not RWF, but no we don't want this.  The
 use-case remains hypothetical: no performance numbers showing a real-world
 benefit have been exhibited AFAICS.

It's not that hard to imagine a performance benefit though? If the
toasted column is accessed infrequently/just after filtering on other
columns (not uncommon) it could very well be beneficial to put the main
table on fast storage (SSD) and the toast table on slow storage
(spinning rust).

I've seen humongous toast tables that are barely ever read for tables
that are constantly read, a couple of times in the field.




Re: [HACKERS] Tab completion for CREATE SEQUENCE

2015-07-07 Thread Andres Freund
On 2015-06-19 06:41:19 +, Brendan Jurd wrote:
 I'm marking this Waiting on Author.  Once the problems have been
 corrected, it should be ready for a committer.

Vik, are you going to update the patch?




Re: [HACKERS] bugfix: incomplete implementation of errhidecontext

2015-07-07 Thread Pavel Stehule
2015-07-07 14:13 GMT+02:00 Andres Freund and...@anarazel.de:

 On 2015-07-03 06:20:14 +0200, Pavel Stehule wrote:
  I would like to use it for controlling (enabling, disabling) CONTEXT in RAISE
  statement in plpgsql. I think one option for this purpose is
  enough, and I would not add another option to specify LOG, CLIENT.

 I don't think a plpgsql function should be able to suppress all
 context. From a security/debuggability POV that's a bad idea. The
 context messages are the only way right now to have any chance of
 tracing back what caused an error in a function because log_statements et
 al. will not show it.


It does that now. The context is not raised for exceptions raised by the RAISE
statement from PL/pgSQL - and I would like to fix it. But sometimes the context
is useless - for the NOTICE level, for example. I have seen strange workarounds -
RAISE NOTICE followed by PERFORM 10/0 to get a context from a PL/pgSQL call.

Pavel


Re: [HACKERS] pg_receivexlog --create-slot-if-not-exists

2015-07-07 Thread Andres Freund
On 2015-06-19 17:21:25 +0200, Cédric Villemain wrote:
  To make slot usage in pg_receivexlog easier, should we add
  --create-slot-if-not-exists? That'd mean you could run the same command
  the first and later invocation.
 
 +1 (with a shorter name please, if you can find one... )

How about a seperate --if-not-exists modifier? Right now --create works
as an abbreviation for --create-slot, but --create-slot-if-not-exists
would break that.




Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission

2015-07-07 Thread Heikki Linnakangas

On 05/16/2015 06:00 AM, Haribabu Kommi wrote:

Regarding next version- are you referring to 9.6 and therefore we
should go ahead and bounce this to the next CF, or were you planning to
post a next version of the patch today?

Yes, for 9.6 version.


No new patch emerged that could be reviewed in this commitfest (July), 
so I've marked this as Returned with feedback.


- Heikki





Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-07-07 Thread Etsuro Fujita

On 2015/07/06 9:42, Kouhei Kaigai wrote:

Also, I no longer want to stick to the assumption that relations involved in
a remote join are all managed by the same foreign server.
The following two ideas introduce possible enhancements of the remote join
feature that involve local relations: replicated tables, or relations
transformed to a VALUES() clause.



I think add_paths_to_joinrel() is the best location for foreign-join,
not only custom-join. Relocation to standard_join_search() has larger
disadvantage than its advantage.



I agree with you that it's important to ensure the expandability, and my
question is, is it possible that the API call from standard_join_search
also realizes those ideas if FDWs can get the join information through the
root or something like that?



I don't deny its possibility, even though I once gave up trying to
reproduce join information - which relations and other ones are joined in
this level - using PlannerInfo and RelOptInfo.


OK


However, we need to pay attention to the advantages over the alternatives.
Hooks in add_paths_to_joinrel() enable implementing identical stuff, with
less complicated logic to reproduce the left / right relations from the
RelOptInfo of the joinrel. (Note that RelOptInfo->fdw_private enables
avoiding path construction multiple times.)
I'm uncertain why this API change is necessary to fix up the problem
around EvalPlanQual.


Yeah, maybe we wouldn't need any API change.  I think we would be able 
to fix this by complicating add_path as you pointed out upthread.  I'm 
not sure that complicating it is a good idea, though.  I think that it 
might be possible that the callback in standard_join_search would allow 
us to fix this without complicating the core path-cost-comparison stuff 
such as add_path.  I noticed that what I proposed upthread doesn't work 
properly, though.


Actually, I have another concern about the callback location that you 
proposed; that might meaninglessly increase planning time in the 
postgres_fdw case when using remote estimates, which the proposed 
postgres_fdw patch doesn't support currently IIUC, but I think it should 
support that.  Let me explain about that.  If you have A JOIN B JOIN C 
all on the same foreign server, for example, we'll have only to perform 
a remote EXPLAIN for A-B-C for the estimates (when adopting a strategy 
that we push down a join as large as possible into the remote server). 
However, ISTM that the callback in add_paths_to_joinrel would perform 
remote EXPLAINs not only for A-B-C but for A-B, A-C and B-C according to 
the dynamic programming algorithm.  (Duplicated remote EXPLAINs for 
A-B-C can be eliminated using a way you proposed.)  Thus the remote 
EXPLAINs for A-B, A-C and B-C seem to me meaningless while incurring 
performance degradation in query planning.  Maybe I'm missing something, 
though.


Best regards,
Etsuro Fujita




Re: [HACKERS] RFC: replace pg_stat_activity.waiting with something more descriptive

2015-07-07 Thread Kyotaro HORIGUCHI
Please forgive me to resend this message for some too-sad
misspellings.

# Waiting for heavy weight locks is somewhat confusing to spell..

===
Hello,

At Tue, 7 Jul 2015 16:27:38 +0900, Fujii Masao masao.fu...@gmail.com wrote in 
CAHGQGwEJwov8YwvmbbWps3Rba6kF1yf7qL3S==Oy4D=gq9y...@mail.gmail.com
 Each backend reports its event when trying to take a lock. But
 the reported event is never reset until next event is reported.
 Is this OK? This means that the wait_event column keeps showing
 the *last* event while a backend is in idle state, for example.
 So, shouldn't we reset the reported event or report another one
 when releasing the lock?

It seems so but pg_stat_activity.waiting would indicate whether
the event is lasting. However, .waiting reflects only the status
of heavy-weight locks. It would be quite misleading.

I think that pg_stat_activity.wait_event should be linked to
.waiting; then .wait_event should be restricted to heavy-weight
locks if the meaning of .waiting cannot be changed.

On the other hand, we need to have as many wait events as
Itagaki-san's patch did so pg_stat_activity might be the wrong
place for full-spec wait_event.

 +read_string_from_waitevent(uint8 wait_event)
 
 The performance of this function looks poor because its worst case
 is O(n): n is the number of all the events that we are trying to track.
 Also in pg_stat_activity, this function is called per backend.
 Can't we retrieve the event name by using wait event ID as an index
 of WaitEventTypeMap array?

+1

regards,


At Tue, 7 Jul 2015 16:27:38 +0900, Fujii Masao masao.fu...@gmail.com wrote in 
CAHGQGwEJwov8YwvmbbWps3Rba6kF1yf7qL3S==Oy4D=gq9y...@mail.gmail.com
 On Tue, Jun 30, 2015 at 10:30 PM, Amit Kapila amit.kapil...@gmail.com wrote:
  On Fri, Jun 26, 2015 at 6:26 PM, Robert Haas robertmh...@gmail.com wrote:
 
  On Thu, Jun 25, 2015 at 11:57 PM, Amit Kapila amit.kapil...@gmail.com
  wrote:
3. Add new view 'pg_stat_wait_event' with following info:
pid   - process id of this backend
waiting - true for any form of wait, false otherwise
wait_event_type - Heavy Weight Lock, Light Weight Lock, I/O wait,
etc
wait_event - Lock (Relation), Lock (Relation Extension), etc
   I am pretty unconvinced that it's a good idea to try to split up the
   wait event into two columns.  I'm only proposing ~20 wait states, so
   there's something like 5 bits of information here.  Spreading that
   over two text columns is a lot, and note that Amit's text would
   basically recapitulate the contents of the first column in the second
   one, which I cannot get excited about.
   There is an advantage in splitting the columns which is if
   wait_event_type
   column indicates Heavy Weight Lock, then user can go and check further
   details in pg_locks, I think he can do that even by seeing wait_event
   column, but that might not be as straightforward as with wait_event_type
   column.
 
  It's just a matter of writing event_type LIKE 'Lock %' instead of
  event_type = 'Lock'.
 
 
  Yes that way it can be done and may be that is not inconvenient for user,
  but then there is other type of information which user might need like what
  distinct resources on which wait is possible, which again he can easily find
  with
  different event_type column. I think some more discussion is required before
  we
  could freeze the user interface for this feature, but in the meantime I have
  prepared an initial patch by adding a new column wait_event in
  pg_stat_activity.
  For now, I have added the support for Heavy-Weight locks, Light-Weight locks
  [1]
  and Buffer Cleanup Lock.  I could add for other types (spin lock delay
  sleep, IO,
  network IO, etc.) if there is no objection in the approach used in patch to
  implement
  this feature.
 
  [1] For LWLocks, currently I have used wait_event as OtherLock for locks
  other than NUM_FIXED_LWLOCKS (Refer function NumLWLocks to see all
  type of LWLocks).  The reason is that there is no straight forward way to
  get
  the id (lockid) of such locks as for some of those (like shared_buffers,
  MaxBackends) the number of locks will depend on run-time configuration
  parameters.  I think if we want to handle those then we could either do some
  math to find out the lockid based on runtime values of these parameters or
  we could add tag in LWLock structure (which indicates the lock type) and
  fill it during Lock initialization or may be some other better way to do it.
 
  I have still not added documentation and have not changed anything for
  waiting column in pg_stat_activity as I think before that we need to
  finalize
  the user interface.  Apart from that as mentioned above still wait for
  some event types (like IO, network IO, etc.) is not added and also I think
  separate function/'s (like we have for existing ones
  pg_stat_get_backend_waiting)
  will be required which again depends upon user interface.
 
 Yes, we need to discuss what events to track. As I suggested 

Re: [HACKERS] anole: assorted stability problems

2015-07-07 Thread Andres Freund
On 2015-06-30 11:35:56 +0200, Andres Freund wrote:
 On 2015-06-29 22:58:05 -0400, Tom Lane wrote:
  So personally, I would be inclined to put back the volatile qualifier,
  independently of any fooling around with _Asm_double_magic_xyzzy
  calls.

 I'm not sure. I think the reliance on an explicit memory barrier is a
 lot more robust and easy to understand than some barely documented odd
 behaviour around volatile. On the other hand the old way worked for a
 long while.

 I'm inclined to just do both on platforms as odd as IA64. But I'd like
 to let anole run with the current set a bit longer - if it doesn't work
 we have more problems than just S_UNLOCK(). It seems EDB has increased
 the run rate for now, so it shouldn't take too long:
 http://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=anole&br=HEAD

So, it's starting to look good. Not exactly allowing for a lot of
confidence yet, but still:
http://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=anole&br=HEAD

I'm inclined to simply revise the comments now, and *not* reintroduce
the volatile. The assumptions documented in:

/*
 * Intel Itanium, gcc or Intel's compiler.
 *
 * Itanium has weak memory ordering, but we rely on the compiler to enforce
 * strict ordering of accesses to volatile data.  In particular, while the
 * xchg instruction implicitly acts as a memory barrier with 'acquire'
 * semantics, we do not have an explicit memory fence instruction in the
 * S_UNLOCK macro.  We use a regular assignment to clear the spinlock, and
 * trust that the compiler marks the generated store instruction with the
 * .rel opcode.
 *
 * Testing shows that assumption to hold on gcc, although I could not find
 * any explicit statement on that in the gcc manual.  In Intel's compiler,
 * the -m[no-]serialize-volatile option controls that, and testing shows that
 * it is enabled by default.
 */

don't sound exactly bullet proof to me. I also personally find explicit
barriers easier to understand in the light of all the other spinlock
implementations.
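
For comparison, the explicit-barrier style would look roughly like this
(a sketch only, not the exact definition to commit;
__sync_lock_release() stores zero with release semantics on gcc):

   #define S_UNLOCK(lock) \
       do { __sync_lock_release(lock); } while (0)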

Comments?




Re: [HACKERS] 9.6 First Commitfest Begins

2015-07-07 Thread Heikki Linnakangas
The first week of the July commitfest has passed. A lot of progress has 
been made, but there's still a lot to do. There are still 53 patches in 
Needs Review state.


Please pick a patch, and review it. Any patch. Don't be afraid of 
reviewing a patch that someone else has signed up for.


If you have submitted a patch for review, please consider also reviewing 
someone else's patch. If no-one has signed up to review your patch, you 
can ask someone who you think would be a good reviewer to do so. He 
might say no, but you've got nothing to lose.


- Heikki





Re: [HACKERS] Transactions involving multiple postgres foreign servers

2015-07-07 Thread Heikki Linnakangas

On 02/17/2015 11:26 AM, Ashutosh Bapat wrote:

Hi All,

Here are the steps and infrastructure for achieving atomic commits across
multiple foreign servers. I have tried to address most of the concerns
raised in this mail thread before. Let me know, if I have left something.
Attached is a WIP patch implementing the same for postgres_fdw. I have
tried to make it FDW-independent.


Wow, this is going to be a lot of new infrastructure. This is going to 
need good documentation, explaining how two-phase commit works in 
general, how it's implemented, how to monitor it etc. It's important to 
explain all the possible failure scenarios where you're left with 
in-doubt transactions, and how the DBA can resolve them.


Since we're building a Transaction Manager into PostgreSQL, please put a 
lot of thought on what kind of APIs it provides to the rest of the 
system. APIs for monitoring it, configuring it, etc. And how an 
extension could participate in a transaction, without necessarily being 
an FDW.


Regarding the configuration, there are many different behaviours that an 
FDW could implement:


1. The FDW is read-only. Commit/abort behaviour is moot.
2. Transactions are not supported. All updates happen immediately 
regardless of the local transaction.
3. Transactions are supported, but two-phase commit is not. There are 
three different ways we can use the remote transactions in that case:

3.1. Commit the remote transaction before local transaction.
3.2. Commit the remote transaction after local transaction.
3.3. As long as there is only one such FDW involved, we can still do 
safe two-phase commit using so-called Last Resource Optimization.

4. Full two-phases commit support

We don't necessarily have to support all of that, but let's keep all
these cases in mind when we design how to configure FDWs. There's
more to it than "does it support 2PC".
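
As a concrete (purely hypothetical) example of such per-server
configuration, an FDW could expose its commit behaviour as a server
option:

   ALTER SERVER remote_pg OPTIONS (ADD two_phase_commit 'on');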



A. Steps during transaction processing


1. When an FDW connects to a foreign server and starts a transaction, it
registers that server with a boolean flag indicating whether that server is
capable of participating in a two phase commit. In the patch this is
implemented using function RegisterXactForeignServer(), which raises an
error, thus aborting the transaction, if there is at least one foreign
server incapable of 2PC in a multiserver transaction. This error thrown as
early as possible. If all the foreign servers involved in the transaction
are capable of 2PC, the function just updates the information. As of now,
in the patch the function is in the form of a stub.

Whether a foreign server is capable of 2PC, can be
a. FDW level decision e.g. file_fdw as of now, is incapable of 2PC but it
can build the capabilities which can be used for all the servers using
file_fdw
b. a decision based on server version type etc. thus FDW can decide that by
looking at the server properties for each server
c. a user decision where the FDW can allow a user to specify it in the form
of CREATE/ALTER SERVER option. Implemented in the patch.

For a transaction involving only a single foreign server, the current code
remains unaltered as two phase commit is not needed.


Just to be clear: you also need two-phase commit if the transaction 
updated anything in the local server and in even one foreign server.



D. Persistent and in-memory storage considerations

I considered following options for persistent storage
1. in-memory table and file(s) - The foreign transaction entries are saved
and manipulated in shared memory. They are written to file whenever
persistence is necessary e.g. while registering the foreign transaction in
step A.2. Requirements C.1, C.2 need some SQL interface in the form of
built-in functions or SQL commands.

The patch implements the in-memory foreign transaction table as a fixed
size array of foreign transaction entries (similar to prepared transaction
entries in twophase.c). This puts a restriction on number of foreign
prepared transactions that need to be maintained at a time. We need
separate locks to syncronize the access to the shared memory; the patch
uses only a single LW lock. There is restriction on the length of prepared
transaction id (or prepared transaction information saved by FDW to be
general), since everything is being saved in fixed size memory. We may be
able to overcome that restriction by writing this information to separate
files (one file per foreign prepared transaction). We need to take the same
route as 2PC for C.5.


Your current approach with a file that's flushed to disk on every update 
has a few problems. Firstly, it's not crash safe. Secondly, if you make 
it crash-safe with fsync(), performance will suffer. You're going to
need several fsyncs per commit with 2PC anyway; there's no way
around that, but the scalable way to do that is to use the WAL so that 
one fsync() can flush more than one 

Re: [HACKERS] [BUGS] BUG #13126: table constraint loses its comment

2015-07-07 Thread Petr Jelinek

On 2015-07-04 13:45, Michael Paquier wrote:

On Fri, Jul 3, 2015 at 11:59 PM, Petr Jelinek wrote:

Well for indexes you don't really need to add the new AT command, as
IndexStmt has char *idxcomment which it will automatically uses as comment
if not NULL. While I am not a huge fan of idxcomment, it doesn't seem
easy to remove in the future, and it's what transformTableLikeClause
uses, so it might be good to be consistent with that.


Oh, right, I completely missed your point and this field in IndexStmt.
Yes it is better to be consistent in this case and to use it. It makes
as well the code easier to follow.
Btw, regarding the new AT routines, I am getting fond of them; it
makes it easier to follow which command is added where in the command
queues.

Updated patch attached.



Cool, I am happy with the patch now. Marking as ready for committer.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Freeze avoidance of very large table.

2015-07-07 Thread Amit Kapila
On Fri, Jul 3, 2015 at 1:55 PM, Sawada Masahiko sawada.m...@gmail.com
wrote:

 On Fri, Jul 3, 2015 at 1:23 AM, Simon Riggs si...@2ndquadrant.com wrote:
  On 2 July 2015 at 16:30, Sawada Masahiko sawada.m...@gmail.com wrote:
 
 
  Also, the flags of each heap page header might be set PD_ALL_FROZEN,
  as well as all-visible
 
 
  Is it possible to have VM bits set to frozen but not visible?
 
  The description makes those two states sound independent of each other.
 
  Are they? Or not? Do we test for an impossible state?
 

 It's impossible to have VM bits set to frozen but not visible.

In the patch, during vacuum, first the frozen bit is set and then the
visibility bit is set in a later operation. Now, if a crash happens between
those two operations, isn't it possible that the frozen bit is set and the
visible bit is not set?

 These bit are controlled independently. But eventually, when
 all-frozen bit is set, all-visible is also set.


Yes, during normal operations it will happen that way, but I think there
are corner cases where that assumption is not true.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Repeated pg_upgrade buildfarm failures on binturon

2015-07-07 Thread Andres Freund
On 2015-07-06 20:00:43 -0400, Tom Lane wrote:
 Andres Freund and...@anarazel.de writes:
  Binturon has repeatedly failed with errors like:
  ERROR:  could not open file "base/16400/32052": No such file or directory
 
 I agree that binturong seems to have something odd going on; but there are
 a lot of other intermittent pg_upgrade test failures in the buildfarm
 history

binturong seemed to be clean on HEAD for a while now, and the failures
~80 days ago seem to have had different symptoms (the src/bin move):
http://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=binturong&br=HEAD

other branches are less nice looking for various reasons, but there's
another recurring error:
FATAL:  could not open relation mapping file "global/pg_filenode.map": No such
file or directory

Those seem to indicate something going seriously wrong to me.




Re: [HACKERS] multivariate statistics / patch v7

2015-07-07 Thread Tomas Vondra

Hello Horiguchi-san!

On 07/07/2015 09:43 PM, Tomas Vondra wrote:

-- histograms
ALTER TABLE t ADD STATISTICS (histogram) on (a,b,c);
ANALYZE t;

EXPLAIN ANALYZE select * from t where a < 0.3 and b < 0.3 and c < 0.3;
Seq Scan on t  (cost=0.00..23870.00 rows=267033 width=24)
(actual time=0.021..330.487 rows=273897 loops=1)

EXPLAIN ANALYZE select * from t where a < 0.3 and b < 0.3 or c > 0.3;
Seq Scan on t  (cost=0.00..23870.00 rows=14317 width=24)
(actual time=0.027..906.321 rows=966870 loops=1)

EXPLAIN ANALYZE select * from t where a < 0.3 and b < 0.3 or c > 0.9;
Seq Scan on t  (cost=0.00..23870.00 rows=20367 width=24)
(actual time=0.028..452.494 rows=393528 loops=1)

This seems wrong, because the estimate for the OR queries should not be
lower than the estimate for the first query (with just AND), and it
should not increase when increasing the boundary. I'd bet this is a bug
in how the inequalities are handled with histograms, or how the AND/OR
clauses are combined. I'll look into that.


FWIW this was a stupid bug in update_match_bitmap_histogram(), which 
initially handled only AND clauses, and thus assumed the match of a 
bucket can only decrease. But for OR clauses this is exactly the 
opposite (we assume no buckets match and add buckets matching at least 
one of the clauses).
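
For readers not following the patch closely, the asymmetry is roughly this
(a minimal standalone sketch; the names are invented here, not taken from
the patch):

    #include <stdbool.h>

    /* One match flag per histogram bucket. */
    static void
    merge_clause_matches(bool *result, const bool *clause_match,
                         int nbuckets, bool is_or)
    {
        int     i;

        for (i = 0; i < nbuckets; i++)
        {
            if (is_or)
                result[i] = result[i] || clause_match[i]; /* can only grow */
            else
                result[i] = result[i] && clause_match[i]; /* can only shrink */
        }
    }

For AND the result bitmap must start all-true (every bucket matches until
some clause rules it out), while for OR it must start all-false (no bucket
matches until some clause admits it); starting from the wrong initial state
is exactly the bug described above.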


With this fixed, the estimates look like this:

EXPLAIN ANALYZE select * from t where a < 0.3 and b < 0.3 and c < 0.3;
Seq Scan on t  (cost=0.00..23870.00 rows=267033 width=24)
   (actual time=0.102..321.524 rows=273897 loops=1)

EXPLAIN ANALYZE select * from t where a < 0.3 and b < 0.3 or c < 0.3;
Seq Scan on t  (cost=0.00..23870.00 rows=319400 width=24)
   (actual time=0.103..386.089 rows=317533 loops=1)

EXPLAIN ANALYZE select * from t where a < 0.3 and b < 0.3 or c > 0.3;
Seq Scan on t  (cost=0.00..23870.00 rows=956833 width=24)
   (actual time=0.133..908.455 rows=966870 loops=1)

EXPLAIN ANALYZE select * from t where a < 0.3 and b < 0.3 or c > 0.9;
Seq Scan on t  (cost=0.00..23870.00 rows=393633 width=24)
   (actual time=0.105..440.607 rows=393528 loops=1)

IMHO pretty accurate estimates - no issue with OR clauses.

I've pushed this to github [1] but I need to do some additional fixes. I 
also had to remove some optimizations while fixing this, and will have 
to reimplement those.


That's not to say that the handling of OR-clauses is perfectly correct. 
After looking at clauselist_selectivity_or(), I believe it's a bit 
broken and will need a bunch of fixes, as explained in the FIXMEs I 
pushed to github.


[1] https://github.com/tvondra/postgres/tree/mvstats

kind regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services




Re: [HACKERS] FPW compression leaks information

2015-07-07 Thread Fujii Masao
On Wed, Jul 8, 2015 at 2:40 AM, Heikki Linnakangas hlinn...@iki.fi wrote:
 On 07/07/2015 07:31 PM, Fujii Masao wrote:

 Or another crazy idea is to append random length dummy data into
 compressed FPW. Which would make it really hard for an attacker to
 guess the information from WAL location.


 It makes the signal more noisy, but you can still mount the same attack if
 you just average it over more iterations. You could perhaps fuzz it enough
 to make the attack impractical, but it doesn't exactly give me a warm fuzzy
 feeling.

If the attack is impractical, what makes you feel nervous?
Maybe we should be concerned about brute-force and dictionary
attacks rather than the attack using wal_compression?
Because ISTM that they are more likely to be able to leak passwords
in practice.
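
To make the threat concrete: the attack being debated boils down to
measuring how far the WAL position advances when inserting a guess next to
the secret, then averaging over many trials, exactly as Heikki notes. A
rough libpq sketch of that measurement (illustrative only; it assumes the
attacker can call the 9.5-era pg_current_xlog_location() and
pg_xlog_location_diff() functions):

    #include <stdlib.h>
    #include <string.h>
    #include <libpq-fe.h>

    /*
     * How many WAL bytes did one INSERT generate?  A smaller delta means
     * the compressed full-page image shrank, i.e. the guessed value
     * shares more bytes with the data already on the target page.
     */
    static long
    wal_bytes_for_guess(PGconn *conn, const char *insert_sql)
    {
        PGresult   *res;
        char        before[64];
        const char *params[1];
        long        delta;

        res = PQexec(conn, "SELECT pg_current_xlog_location()");
        strncpy(before, PQgetvalue(res, 0, 0), sizeof(before) - 1);
        before[sizeof(before) - 1] = '\0';
        PQclear(res);

        PQclear(PQexec(conn, insert_sql));  /* plant the guess on the page */

        params[0] = before;
        res = PQexecParams(conn,
                           "SELECT pg_xlog_location_diff("
                           "pg_current_xlog_location(), $1)",
                           1, NULL, params, NULL, NULL, 0);
        delta = atol(PQgetvalue(res, 0, 0));
        PQclear(res);

        return delta;   /* average over many trials to beat any padding */
    }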

Regards,

-- 
Fujii Masao




Re: [HACKERS] FPW compression leaks information

2015-07-07 Thread Fujii Masao
On Tue, Jul 7, 2015 at 11:34 PM, Stephen Frost sfr...@snowman.net wrote:
 * Fujii Masao (masao.fu...@gmail.com) wrote:
 On Tue, Jul 7, 2015 at 10:31 PM, Stephen Frost sfr...@snowman.net wrote:
  I'm not following.  If we don't want the information to be available to
  everyone then we need to limit it, and right now the only way to do that
  is to restrict it to superuser because we haven't got anything more
  granular.

 A user can very easily limit such information by not enabling 
 wal_compression.
 No need to restrict the usage of WAL location functions.
 Of course, as Michael suggested, we need to make the parameter SUSET
 so that only superuser can change the setting, though.

 I agree with making it SUSET, but that doesn't address the issue.

ISTM that our consensus is to make wal_compression SUSET
as the first step, whatever approach we adopt later for the risk in
question. So, barring any objection, I will commit the attached patch
and change the context to PGC_SUSET.
and change the context to PGC_SUSET.

Regards,

-- 
Fujii Masao
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***
*** 2303,2308  include_dir 'conf.d'
--- 2303,2309 
  <xref linkend="guc-full-page-writes"> is on or during a base backup.
  A compressed page image will be decompressed during WAL replay.
  The default value is <literal>off</literal>.
+ Only superusers can change this setting.
 /para
  
 para
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
***
*** 995,1001  static struct config_bool ConfigureNamesBool[] =
  	},
  
  	{
! 		{"wal_compression", PGC_USERSET, WAL_SETTINGS,
  			gettext_noop("Compresses full-page writes written in WAL file."),
  			NULL
  		},
--- 995,1001 
  	},
  
  	{
! 		{"wal_compression", PGC_SUSET, WAL_SETTINGS,
  			gettext_noop("Compresses full-page writes written in WAL file."),
  			NULL
  		},



Re: [HACKERS] FPW compression leaks information

2015-07-07 Thread Fujii Masao
On Tue, Jul 7, 2015 at 10:31 PM, Stephen Frost sfr...@snowman.net wrote:
 * Heikki Linnakangas (hlinn...@iki.fi) wrote:
 On 07/07/2015 04:15 PM, Stephen Frost wrote:
 * Fujii Masao (masao.fu...@gmail.com) wrote:
 On Thu, Apr 16, 2015 at 4:26 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 + the compression ratio of a full page image gives a hint of what is
 + the existing data of this page.  Tables that contain sensitive
 + information like <structname>pg_authid</structname> with password
 + data could be potential targets to such attacks. Note that as a
 + prerequisite a user needs to be able to insert data on the same page
 + as the data targeted and needs to be able to detect checkpoint
 + presence to find out if a compressed full page write is included in
 + WAL to calculate the compression ratio of a page using WAL positions
 + before and after inserting data on the page with data targeted.
 +    </para>
 +   </warning>
 
 I think that, in addition to the description of the risk, it's better to
 say that this vulnerability is cumbersome to exploit in practice.
 
 Attached is the updated version of the patch. Comments?
 
 Personally, I don't like simply documenting this issue.  I'd much rather
 we restrict the WAL info as it's really got no business being available
 to the general public.  I'd be fine with restricting that information to
 superusers when compression is enabled, or always, for 9.5 and then
 fixing it properly in 9.6 by allowing it to be visible to a
 pg_monitoring default role which admins can then grant to users who
 should be able to see it.

 Well, then you could still launch the same attack if you have the
 pg_monitoring privileges. So it would be more like a "monitoring and
 grab everyone's passwords" privilege ;-). Ok, in seriousness the
 attack isn't that easy to perform, but I still wouldn't feel
 comfortable giving that privilege to anyone who isn't a superuser
 anyway.

 The alternative is to have monitoring tools which are running as
 superuser, which, in my view at least, is far worse.

 Yes, we'll get flak from people who are running with non-superuser
 monitoring tools today, but we already have a bunch of superuser-only
 things that monitoring tools want, so this doesn't move the needle
 particularly far, in my view.

 That would be a major drawback IMHO, and a step in the wrong direction.

 I'm not following.  If we don't want the information to be available to
 everyone then we need to limit it, and right now the only way to do that
 is to restrict it to superuser because we haven't got anything more
 granular.

A user can very easily limit such information by not enabling wal_compression.
No need to restrict the usage of WAL location functions.
Of course, as Michael suggested, we need to make the parameter SUSET
so that only superuser can change the setting, though.

Or you're concerned about the case where a careless user enables
wal_compression unexpectedly even though he or she considers the risk
very serious? Yes, in this case, a non-superuser may be able to exploit
the vulnerability by seeing the WAL position. But there are many cases
where an improper setting causes unwanted results. So I'm not sure why
we need to treat wal_compression so specially.

Regards,

-- 
Fujii Masao




Re: [HACKERS] PL/pgSQL, RAISE and error context

2015-07-07 Thread Pavel Stehule
2015-07-07 15:56 GMT+02:00 Merlin Moncure mmonc...@gmail.com:

 On Tue, Jul 7, 2015 at 8:13 AM, Heikki Linnakangas hlinn...@iki.fi
 wrote:
  On 01/26/2015 05:14 PM, Tom Lane wrote:
 
  Pavel Stehule pavel.steh...@gmail.com writes:
 
  2015-01-26 14:02 GMT+01:00 Marko Tiikkaja ma...@joh.to:
  I am thinking the solution

    /* if we are doing RAISE, don't report its location */
    if (estate->err_text == raise_skip_msg)
        return;

  is too simple, and this part should be fixed. This change can be done
  on the plpgsql or libpq side. This is a bug, and it should be fixed.
 
 
  Doing this in libpq is utterly insane.  It has not got sufficient context
  to do anything intelligent.  The fact that it's not intelligent is exposed
  by the regression test changes that the proposed patch causes, most of
  which do not look like improvements.
 
  How can the server know if the client wants to display context
 information?

 It doesn't have to if the behavior is guarded with a GUC.  I just
 don't understand what all the fuss is about.  The default behavior of
 logging that is well established by other languages (for example java)
 that manage error stack for you should be to:

 *) Give stack trace when an uncaught exception is thrown
 *) Do not give stack trace in all other logging cases unless asked for


what is RAISE EXCEPTION - first or second case?



 I would be happy to show you the psql redirected output logs from my
 nightly server processes that spew into the megabytes because of
 logging various high level steps (did this, did that).   I can't throw
 the verbose switch to terse because if the error happens to be
 'Division by Zero', or some other difficult to trace problem then I'm
 sunk.  I believe the protocol decision to 'always send context' needs
 to be revisited; if your server-side codebase is large and heavily
 nested it makes logging an expensive operation even if the client
 strips off the log.

 plpgsql.min_context seems like the ideal solution to this problem; it
 can be managed on the server or the client and does not require new
 syntax.  If we require syntax to slip in and out of debugging type
 operations the solution has missed the mark IMNSHO.

 merlin
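
To sketch what Merlin's plpgsql.min_context idea might look like inside the
existing error callback (purely hypothetical -- no such GUC exists at this
point, and the geterrlevel() accessor is an assumption, not a real API):

    static int  plpgsql_min_context = ERROR;    /* hypothetical GUC */

    static void
    plpgsql_exec_error_callback(void *arg)
    {
        PLpgSQL_execstate *estate = (PLpgSQL_execstate *) arg;

        /* existing special case: RAISE reports its own location */
        if (estate->err_text == raise_skip_msg)
            return;

        /* hypothetical: skip context lines for low-severity messages */
        if (geterrlevel() < plpgsql_min_context)    /* assumed accessor */
            return;

        /* ... the usual errcontext() reporting continues here ... */
    }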



Re: [HACKERS] [PATCH] Add missing \ddp psql command

2015-07-07 Thread Fujii Masao
On Tue, Jul 7, 2015 at 10:44 PM, David Christensen da...@endpoint.com wrote:

 On Jul 7, 2015, at 3:53 AM, Fujii Masao masao.fu...@gmail.com wrote:

 On Tue, Jul 7, 2015 at 2:11 AM, David Christensen da...@endpoint.com wrote:
 Quickie patch for spotted missing psql \ddp tab-completion.

 Thanks for the report and patch!

 I found that tab-completion was not supported not only for \ddp
 but also for other psql meta-commands like \dE, \dm, \dO, \dy, \s and
 \setenv. Attached patch adds all those missing tab-completions.

 From a completeness standpoint it makes sense to me to have all of them, but 
 from a practical standpoint a single-character completion like \s isn’t going 
 to do much. :)

Sometimes I type TAB after \ to display all the psql meta commands.
Even single-character completion like \s may be useful for that case.
Yeah, I agree that's a narrow use case, though.
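
For context, the shape of such a fix (an illustrative excerpt only -- the
real list in src/bin/psql/tab-complete.c is much longer): psql completes
backslash commands from a static array, so the missing commands just need
entries there.

    static const char *const backslash_commands[] = {
        "\\a", "\\connect", "\\conninfo", "\\copy", "\\copyright",
        "\\d", "\\da", "\\db", "\\dd", "\\ddp", "\\dD",
        "\\dE", "\\df", "\\dm", "\\dO", "\\dy",
        "\\s", "\\set", "\\setenv",
        /* ... remaining commands elided ... */
        NULL
    };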

Regards,

-- 
Fujii Masao




Re: [HACKERS] Comfortably check BackendPID with psql

2015-07-07 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 Pushed the patch. I only made a minor belt-and-suspenders type of
 change, namely to check whether PQbackendPID() returns 0 and not print
 that and replaced PID by pid in the docs and comments.

I would s/pid/process ID/ in the docs.  PID is not a particularly
user-friendly term, and it's even less so if you fail to upper-case it.

regards, tom lane




Re: [HACKERS] pg_trgm version 1.2

2015-07-07 Thread Alexander Korotkov
On Tue, Jun 30, 2015 at 11:28 PM, Jeff Janes jeff.ja...@gmail.com wrote:

 On Tue, Jun 30, 2015 at 2:46 AM, Alexander Korotkov 
 a.korot...@postgrespro.ru wrote:

 On Sun, Jun 28, 2015 at 1:17 AM, Jeff Janes jeff.ja...@gmail.com wrote:

 This patch implements version 1.2 of contrib module pg_trgm.

 This supports the triconsistent function, introduced in version 9.4 of
 the server, to make it faster to implement indexed queries where some keys
 are common and some are rare.


 Thank you for the patch! Lack of pg_trgm triconsistent support was
 significant miss after fast scan implementation.


 I've included the paths to both upgrade and downgrade between 1.1 and
 1.2, although after doing so you must close and restart the session before
 you can be sure the change has taken effect. There is no change to the
 on-disk index structure


 pg_trgm--1.1.sql and pg_trgm--1.1--1.2.sql are useful for debug, but do
 you expect them in the final commit? As I can see in other contribs we have
 only the last version and upgrade scripts.



 I had thought that pg_trgm--1.1.sql was needed for pg_upgrade to work, but
 I see that that is not the case.

 I did see another downgrade path for different module, but on closer
 inspection it was one that I wrote while trying to analyze that module, and
 then never removed.  I have no objection to removing pg_trgm--1.2--1.1.sql
 before the commit, but I don't see what we gain by doing so.  If a
 downgrade is feasible and has been tested, why not include it?


See Tom Lane's comment about downgrade scripts. I think just remove it is a
right solution.


 You could get the same benefit just by increasing MAX_MAYBE_ENTRIES (in
 core) from 4 to some higher value (which it probably should be anyway, but
 there will always be a case where it needs to be higher than you can afford
 it to be, so a real solution is needed).


 Actually, it depends on how long it takes to calculate consistent
 function. The cheaper consistent function is, the higher MAX_MAYBE_ENTRIES
 could be. Since all functions in PostgreSQL may define its cost, GIN could
 calculate MAX_MAYBE_ENTRIES from the cost of consistent function.


 The consistent function gets called on every candidate tuple, so if it is
 expensive then it is also all the more worthwhile to reduce the set of
 candidate tuples.  Perhaps MAX_MAYBE_ENTRIES could be calculated from the
 log of the maximum of the predictNumberResult entries? Anyway, a subject
 for a different day


Yes, that's also a point. However, we optimize not only the costs of consistent
calls but also the costs of getting candidate item pointers, which are both CPU
and IO.

There may also be some gains in the similarity and regex cases, but I
 didn't really analyze those for performance.



 I've thought about how to document this change.  Looking to other
 example of other contrib modules with multiple versions, I decided that we
 don't document them, other than in the release notes.


 The same patch applies to 9.4 code with a minor conflict in the
 Makefile, and gives benefits there as well.


 Some other notes about the patch:
 * You allocate boolcheck array and don't use it.


 That was a bug (at least notionally).  trigramsMatchGraph was supposed to
 be getting boolcheck, not check, passed in to it.

 It may not have been a bug in practice, because GIN_MAYBE and GIN_TRUE
 both test as true when cast to booleans. But it seems wrong to rely on
 that.  Or was it intended to work this way?

 I'm surprised the compiler suite doesn't emit some kind of warning on that.


I'm not sure. It's attractive to get rid of the useless memory allocation
and copying.
You can also change trigramsMatchGraph() to take a GinTernaryValue *
argument. Probably casting bool * to GinTernaryValue * looks better than
the inverse casting.




 * Check coding style and formatting, in particular check[i]==GIN_TRUE
 should be check[i] == GIN_TRUE.


 Sorry about that, fixed.  I also changed it in other places to check[i] !=
 GIN_FALSE, rather than checking against both GIN_TRUE and GIN_MAYBE.  At
 first I was concerned we might decide to add a 4th option to the type which
 would render "!= GIN_FALSE" the wrong way to test for it.  But since it is
 called GinTernaryValue, I think we wouldn't add a fourth thing to it.  But
 perhaps the more verbose form is clearer to some people.


Thanks!


  * I think some comments are needed in gin_trgm_triconsistent() about
  trigramsMatchGraph(). gin_trgm_triconsistent() may use trigramsMatchGraph()
  that way because trigramsMatchGraph() implements a monotonic boolean
  function.


 I have a function-level comment that in all cases, GIN_TRUE is at least as
 favorable to inclusion of a tuple as GIN_MAYBE.  Should I reiterate that
 comment before trigramsMatchGraph() as well?  Calling it a monotonic
 boolean function is precise and concise, but probably less understandable
 to people reading the code.  At least, if I saw that, I would need to go do
 some reading before I knew what it meant.
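
A side note on why monotonicity is what makes this safe: for a monotonic
boolean function, evaluating it at the two extremes of the MAYBE keys
brackets every possible outcome. A self-contained sketch (not code from
the patch):

    #include <stdbool.h>
    #include "access/gin.h"     /* GinTernaryValue, GIN_TRUE, ... */

    static GinTernaryValue
    ternary_from_monotonic(bool (*consistent) (const bool *check, void *arg),
                           const GinTernaryValue *check, int nkeys, void *arg)
    {
        bool        lo[nkeys];  /* every GIN_MAYBE treated as false */
        bool        hi[nkeys];  /* every GIN_MAYBE treated as true */
        bool        rlo, rhi;
        int         i;

        for (i = 0; i < nkeys; i++)
        {
            lo[i] = (check[i] == GIN_TRUE);
            hi[i] = (check[i] != GIN_FALSE);
        }
        rlo = consistent(lo, arg);
        rhi = consistent(hi, arg);

        if (rlo == rhi)         /* MAYBE keys cannot change the answer */
            return rhi ? GIN_TRUE : GIN_FALSE;
        return GIN_MAYBE;
    }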


Let's 

Re: [HACKERS] creating extension including dependencies

2015-07-07 Thread Andres Freund
On 2015-07-07 22:36:29 +0900, Fujii Masao wrote:
 On Mon, Jun 15, 2015 at 7:50 AM, Petr Jelinek p...@2ndquadrant.com wrote:
  Hi,
 
  I am getting tired of installing required extensions manually. I was
  wondering if we might want to add an option to CREATE EXTENSION that would
  allow automatic creation of the extensions required by the extension that
  is being installed by the user.
 
 I'm wondering how helpful this feature is. Because, even if we can save
 some steps for CREATE EXTENSION by using the feature, we still need to
 manually find out, download and install all the extensions that the target
 extension depends on. So isn't it better to implement a tool like yum, i.e.,
 one which performs all those steps almost automatically, rather than the
 proposed feature? Maybe it's outside PostgreSQL core.

That doesn't seem to make much sense to me. Something like yum can't
install everything in all relevant databases. Sure, yum will be used to
install dependencies between extensions on the filesystem level.

At the minimum I'd like to see that CREATE EXTENSION foo; would
install extension 'bar' if foo depended on 'bar' and CASCADE is
specified. Right now we always error out saying that the dependency on
'bar' is not fulfilled - not particularly helpful.
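
The shape of that could be roughly the following (a sketch only;
create_extension_recurse() is an invented helper, though control->requires
and get_extension_oid() are existing backend pieces):

    ListCell   *lc;

    foreach(lc, control->requires)
    {
        char       *reqname = (char *) lfirst(lc);

        if (!OidIsValid(get_extension_oid(reqname, true)))
        {
            if (!cascade)
                ereport(ERROR,
                        (errmsg("required extension \"%s\" is not installed",
                                reqname)));
            /* invented helper: install the dependency first */
            create_extension_recurse(reqname, cascade);
        }
    }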




Re: [HACKERS] [PATCH] Add missing \ddp psql command

2015-07-07 Thread David Christensen

 On Jul 7, 2015, at 3:53 AM, Fujii Masao masao.fu...@gmail.com wrote:
 
 On Tue, Jul 7, 2015 at 2:11 AM, David Christensen da...@endpoint.com wrote:
 Quickie patch for spotted missing psql \ddp tab-completion.
 
 Thanks for the report and patch!
 
 I found that tab-completion was not supported not only for \ddp
 but also for other psql meta-commands like \dE, \dm, \dO, \dy, \s and
 \setenv. Attached patch adds all those missing tab-completions.

From a completeness standpoint it makes sense to me to have all of them, but
from a practical standpoint a single-character completion like \s isn’t going
to do much. :)

David
--
David Christensen
PostgreSQL Team Manager
End Point Corporation
da...@endpoint.com
785-727-1171









Re: [HACKERS] Fix broken Install.bat when target directory contains a space

2015-07-07 Thread Heikki Linnakangas

On 07/07/2015 02:56 AM, Tom Lane wrote:

Michael Paquier michael.paqu...@gmail.com writes:

On Tue, Jul 7, 2015 at 4:19 AM, Heikki Linnakangas hlinn...@iki.fi wrote:

Ok, committed that way.



Shouldn't this patch be backpatched? In the backbranches install.bat
does not work correctly with paths containing spaces.


I was about to ask the same.  AFAICS, the other scripts have been similar
one-liners at least since 9.0.


Ok then, pushed to back-branches too.

- Heikki





Re: [HACKERS] Redundant error messages in policy.c

2015-07-07 Thread Joe Conway

On 07/06/2015 02:26 PM, Stephen Frost wrote:
 Daniele,
 
 * Daniele Varrazzo (daniele.varra...@gmail.com) wrote:
 There are 5 different strings (one has a whitespace error), they 
 could be 2. Patch attached.
 
 Fair point.  I did try to address the language around policy vs. 
 row security vs. row level security, but didn't look as closely as 
 I should have at the error messages overall.
 
 Will address.

Pushed. Needed a small change to the expected regression output to match.

Joe


--
Joe Conway




Re: [HACKERS] Performance improvement for joins where outer side is unique

2015-07-07 Thread Alexander Korotkov
Hi!

On Sat, Mar 28, 2015 at 10:35 AM, David Rowley dgrowle...@gmail.com wrote:

 On 25 March 2015 at 01:11, Kyotaro HORIGUCHI 
 horiguchi.kyot...@lab.ntt.co.jp wrote:

 Hi, thanks for the new patch.

 I made an additional shrink from your last one. Do you have a
 look on the attached?


 Thanks, for looking again.

 I'm not too sure I like the idea of relying on join removals to mark the
 is_unique_join property.
 By explicitly doing it in mark_unique_joins we have the nice side effect
 of not having to re-check a relation's unique properties after removing
 another relation, providing the relation was already marked as unique on
 the first pass.


 At Sun, 22 Mar 2015 19:42:21 +1300, David Rowley dgrowle...@gmail.com
 wrote in 
 caaphdvrkwmmtwkxfn4uazyza9jql1c7uwbjbtuwfr69rqlv...@mail.gmail.com
  On 20 March 2015 at 21:11, David Rowley dgrowle...@gmail.com wrote:
  
   I can continue working on your patch if you like? Or are you planning
 to
   go further with it?

 It's fine that you continue to work on this.

 # Sorry for the hardly baked patch which had left many things alone:(

  I've been working on this more over the weekend and I've re-factored
 things
  to allow LEFT JOINs to be properly marked as unique.
  I've also made changes to re-add support for detecting the uniqueness of
  sub-queries.

 I don't see the point of calling mark_unique_joins for every
 iteration on join_info_list in remove_useless_joins.


 I've fixed this in the attached. I must have forgotten to put the test for
 LEFT JOINs here as I was still thinking that I might make a change to the
 code that converts unique semi joins to inner joins so that it just checks
 is_unique_join instead of calling relation_has_unique_index_for().


Patch doesn't apply to current master. Could you, please, rebase it?

patching file src/backend/commands/explain.c
Hunk #1 succeeded at 1193 (offset 42 lines).
patching file src/backend/executor/nodeHashjoin.c
patching file src/backend/executor/nodeMergejoin.c
patching file src/backend/executor/nodeNestloop.c
patching file src/backend/nodes/copyfuncs.c
Hunk #1 succeeded at 2026 (offset 82 lines).
patching file src/backend/nodes/equalfuncs.c
Hunk #1 succeeded at 837 (offset 39 lines).
patching file src/backend/nodes/outfuncs.c
Hunk #1 succeeded at 2010 (offset 62 lines).
patching file src/backend/optimizer/path/costsize.c
Hunk #1 succeeded at 1780 (offset 68 lines).
Hunk #2 succeeded at 1814 with fuzz 2 (offset 68 lines).
Hunk #3 succeeded at 1887 with fuzz 2 (offset 38 lines).
Hunk #4 succeeded at 2759 (offset 97 lines).
patching file src/backend/optimizer/path/joinpath.c
Hunk #1 succeeded at 19 with fuzz 2 (offset 1 line).
Hunk #2 FAILED at 30.
Hunk #3 succeeded at 46 (offset -5 lines).
Hunk #4 succeeded at 84 with fuzz 2 (offset -8 lines).
Hunk #5 FAILED at 241.
Hunk #6 FAILED at 254.
Hunk #7 FAILED at 284.
Hunk #8 FAILED at 299.
Hunk #9 succeeded at 373 with fuzz 2 (offset 4 lines).
Hunk #10 succeeded at 385 with fuzz 2 (offset 4 lines).
Hunk #11 FAILED at 411.
Hunk #12 succeeded at 470 with fuzz 2 (offset 1 line).
Hunk #13 FAILED at 498.
Hunk #14 succeeded at 543 with fuzz 2 (offset -3 lines).
Hunk #15 FAILED at 604.
Hunk #16 FAILED at 617.
Hunk #17 FAILED at 748.
Hunk #18 FAILED at 794.
Hunk #19 FAILED at 808.
Hunk #20 FAILED at 939.
Hunk #21 FAILED at 966.
Hunk #22 FAILED at 982.
Hunk #23 FAILED at 1040.
Hunk #24 FAILED at 1140.
Hunk #25 FAILED at 1187.
Hunk #26 FAILED at 1222.
Hunk #27 FAILED at 1235.
Hunk #28 FAILED at 1310.
Hunk #29 FAILED at 1331.
Hunk #30 FAILED at 1345.
Hunk #31 FAILED at 1371.
Hunk #32 FAILED at 1410.
25 out of 32 hunks FAILED -- saving rejects to file
src/backend/optimizer/path/joinpath.c.rej
patching file src/backend/optimizer/path/joinrels.c
patching file src/backend/optimizer/plan/analyzejoins.c
patching file src/backend/optimizer/plan/createplan.c
Hunk #1 succeeded at 135 (offset 4 lines).
Hunk #2 succeeded at 155 (offset 4 lines).
Hunk #3 succeeded at 2304 (offset 113 lines).
Hunk #4 succeeded at 2598 (offset 113 lines).
Hunk #5 succeeded at 2724 (offset 113 lines).
Hunk #6 succeeded at 3855 (offset 139 lines).
Hunk #7 succeeded at 3865 (offset 139 lines).
Hunk #8 succeeded at 3880 (offset 139 lines).
Hunk #9 succeeded at 3891 (offset 139 lines).
Hunk #10 succeeded at 3941 (offset 139 lines).
Hunk #11 succeeded at 3956 (offset 139 lines).
patching file src/backend/optimizer/plan/initsplan.c
patching file src/backend/optimizer/plan/planmain.c
patching file src/backend/optimizer/util/pathnode.c
Hunk #1 succeeded at 1541 (offset 27 lines).
Hunk #2 succeeded at 1557 (offset 27 lines).
Hunk #3 succeeded at 1610 (offset 27 lines).
Hunk #4 succeeded at 1625 (offset 27 lines).
Hunk #5 succeeded at 1642 (offset 27 lines).
Hunk #6 succeeded at 1670 (offset 27 lines).
Hunk #7 succeeded at 1688 (offset 27 lines).
Hunk #8 succeeded at 1703 (offset 27 lines).
Hunk #9 succeeded at 1741 (offset 27 lines).
patching file src/include/nodes/execnodes.h
Hunk #1 succeeded at 1636 

Re: [HACKERS] Spurious full-stop in message

2015-07-07 Thread Joe Conway

On 07/06/2015 03:01 PM, Stephen Frost wrote:
 * Daniele Varrazzo (daniele.varra...@gmail.com) wrote:
 postgresql/src/backend$ grep "must be superuser to change bypassrls attribute" commands/user.c | sed 's/ \+//'
 errmsg("must be superuser to change bypassrls attribute.")));
 errmsg("must be superuser to change bypassrls attribute")));
 
 Patch to fix attached.
 
 Will fix.

Pushed


--
Joe Conway




Re: [HACKERS] Repeated pg_upgrade buildfarm failures on binturon

2015-07-07 Thread Oskari Saarenmaa
On 07.07.2015 14:21, Andres Freund wrote:
 On 2015-07-06 20:00:43 -0400, Tom Lane wrote:
 Andres Freund and...@anarazel.de writes:
 Binturon has repeatedly failed with errors like:
 ERROR:  could not open file "base/16400/32052": No such file or directory

 I agree that binturong seems to have something odd going on; but there are
 a lot of other intermittent pg_upgrade test failures in the buildfarm
 history
 
 binturong seemed to be clean on HEAD for a while now, and the failures
 ~80 days ago seem to have had different symptoms (the src/bin move):
 http://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=binturongbr=HEAD
 
 other branches are less nice looking for various reasons, but there's
 another recurring error:
 FATAL:  could not open relation mapping file "global/pg_filenode.map": No
 such file or directory
 
 Those seem to indicate something going seriously wrong to me.

Binturong and Dingo run on the same host with an hourly cronjob to
trigger the builds.  These failures are caused by concurrent test runs
on different branches which use the same tmp_check directory for
pg_upgrade tests, see
http://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=dingo&dt=2015-07-07%2002%3A58%3A01&stg=check-pg_upgrade

It looks like neither make (GNU Make 4.0) nor the shell (default Solaris
/bin/sh) updates $PWD to point to the current directory where test.sh is
executed, so test.sh puts the test cluster in the original working
directory of the process that launched make.

I've restricted builds to one at a time on that host to work around this
issue for now.  Also attached a patch to explicitly set PWD=$(CURDIR) in
the Makefile to make sure test.sh runs with the right directory.

/ Oskari
From 61b18821553aa8193e46ad66904fabacb5a5a50a Mon Sep 17 00:00:00 2001
From: Oskari Saarenmaa o...@ohmu.fi
Date: Tue, 7 Jul 2015 16:19:42 +0300
Subject: [PATCH] pg_upgrade: explicitly set PWD=$(CURDIR) to work around
 solaris issue

---
 src/bin/pg_upgrade/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/bin/pg_upgrade/Makefile b/src/bin/pg_upgrade/Makefile
index d9c8145..3e338e0 100644
--- a/src/bin/pg_upgrade/Makefile
+++ b/src/bin/pg_upgrade/Makefile
@@ -35,7 +35,7 @@ clean distclean maintainer-clean:
 	   pg_upgrade_dump_*.custom pg_upgrade_*.log
 
 check: test.sh all
-	MAKE=$(MAKE) bindir=$(bindir) libdir=$(libdir) EXTRA_REGRESS_OPTS="$(EXTRA_REGRESS_OPTS)" $(SHELL) $< --install
+	MAKE=$(MAKE) bindir=$(bindir) libdir=$(libdir) EXTRA_REGRESS_OPTS="$(EXTRA_REGRESS_OPTS)" PWD=$(CURDIR) $(SHELL) $< --install
 
 # disabled because it upsets the build farm
 #installcheck: test.sh
-- 
1.8.4.1




Re: [HACKERS] FPW compression leaks information

2015-07-07 Thread Stephen Frost
* Heikki Linnakangas (hlinn...@iki.fi) wrote:
 On 07/07/2015 04:15 PM, Stephen Frost wrote:
 * Fujii Masao (masao.fu...@gmail.com) wrote:
 On Thu, Apr 16, 2015 at 4:26 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 + the compression ratio of a full page image gives a hint of what is
 + the existing data of this page.  Tables that contain sensitive
 + information like <structname>pg_authid</structname> with password
 + data could be potential targets to such attacks. Note that as a
 + prerequisite a user needs to be able to insert data on the same page
 + as the data targeted and needs to be able to detect checkpoint
 + presence to find out if a compressed full page write is included in
 + WAL to calculate the compression ratio of a page using WAL positions
 + before and after inserting data on the page with data targeted.
 +    </para>
 +   </warning>
 
 I think that, in addition to the description of the risk, it's better to
 say that this vulnerability is cumbersome to exploit in practice.
 
 Attached is the updated version of the patch. Comments?
 
 Personally, I don't like simply documenting this issue.  I'd much rather
 we restrict the WAL info as it's really got no business being available
 to the general public.  I'd be fine with restricting that information to
 superusers when compression is enabled, or always, for 9.5 and then
 fixing it properly in 9.6 by allowing it to be visible to a
 pg_monitoring default role which admins can then grant to users who
 should be able to see it.
 
 Well, then you could still launch the same attack if you have the
 pg_monitoring privileges. So it would be more like a "monitoring and
 grab everyone's passwords" privilege ;-). Ok, in seriousness the
 attack isn't that easy to perform, but I still wouldn't feel
 comfortable giving that privilege to anyone who isn't a superuser
 anyway.

The alternative is to have monitoring tools which are running as
superuser, which, in my view at least, is far worse.

 Yes, we'll get flak from people who are running with non-superuser
 monitoring tools today, but we already have a bunch of superuser-only
 things that monitoring tools want, so this doesn't move the needle
 particularly far, in my view.
 
 That would be a major drawback IMHO, and a step in the wrong direction.

I'm not following.  If we don't want the information to be available to
everyone then we need to limit it, and right now the only way to do that
is to restrict it to superuser because we haven't got anything more
granular.

In other words, it seems like your above response about not wanting this
to be available to anyone except superusers is counter to this last
sentence where you seem to be saying we should continue to have the
information available to everyone and simply document that there's a
risk there as in the proposed patch.

Thanks,

Stephen




Re: [HACKERS] auto_explain sample rate

2015-07-07 Thread Julien Rouhaud
On 05/07/2015 18:22, Julien Rouhaud wrote:
 On 03/06/2015 15:00, Craig Ringer wrote:


 On 3 June 2015 at 20:04, Andres Freund and...@anarazel.de
 mailto:and...@anarazel.de wrote:

 On 2015-06-03 18:54:24 +0800, Craig Ringer wrote:
  OK, here we go.

 Hm. Wouldn't random sampling be better than what you do? If your queries
 have a pattern to them (e.g. you always issue the same 10 queries in
 succession), this will possibly only show a subset of the queries.

 I think a formulation in fraction (i.e. a float between 0 and 1) will
 also be easier to understand.


 Could be, yeah. I was thinking about the cost of generating a random
 number each time, but it's going to vanish in the noise compared to the
 rest of the costs in query execution.

 
 Hello, I've just reviewed the patch.
 
 I'm not sure if there's a consensus on the sample rate format.  FWIW, I
 also think a fraction would be easier to understand.  Any news about
 generating a random number at each call to avoid the query pattern problem?
 
 The patch applies without error.  I wonder if there's any reason for
 using pg_lrand48() instead of random(), as there's a port for random()
 if the system lacks it.
 
 After some quick checks, I found that auto_explain_sample_counter is
 always initialized with the same value.  After some digging, it seems
 that pg_lrand48() always returns the same values in the same order, at
 least on my computer.  Have I missed something?
 

Well, I obviously missed that pg_srand48() is only used if the system
lacks random/srandom, sorry for the noise.  So yes, random() must be
used instead of pg_lrand48().

I'm attaching a new version of the patch fixing this issue just in case.


 Otherwise, after replacing the pg_lrand48() call with a random(), it
 works just fine.
 
 ---
  Craig Ringer   http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training  Services
 
 


-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org
*** a/contrib/auto_explain/auto_explain.c
--- b/contrib/auto_explain/auto_explain.c
***
*** 29,34  static bool auto_explain_log_triggers = false;
--- 29,35 
  static bool auto_explain_log_timing = true;
  static int	auto_explain_log_format = EXPLAIN_FORMAT_TEXT;
  static bool auto_explain_log_nested_statements = false;
+ static int  auto_explain_sample_ratio = 1;
  
  static const struct config_enum_entry format_options[] = {
  	{"text", EXPLAIN_FORMAT_TEXT, false},
***
*** 47,55  static ExecutorRun_hook_type prev_ExecutorRun = NULL;
  static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
  static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
  
  #define auto_explain_enabled() \
  	(auto_explain_log_min_duration >= 0 && \
! 	 (nesting_level == 0 || auto_explain_log_nested_statements))
  
  void		_PG_init(void);
  void		_PG_fini(void);
--- 48,61 
  static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
  static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
  
+ /* per-backend counter used for ratio sampling */
+ static int  auto_explain_sample_counter = 0;
+ 
  #define auto_explain_enabled() \
  	(auto_explain_log_min_duration >= 0 && \
! 	 (nesting_level == 0 || auto_explain_log_nested_statements) && \
! 	 (auto_explain_sample_ratio == 1 || auto_explain_sample_counter == 0))
! 
  
  void		_PG_init(void);
  void		_PG_fini(void);
***
*** 61,66  static void explain_ExecutorRun(QueryDesc *queryDesc,
--- 67,85 
  static void explain_ExecutorFinish(QueryDesc *queryDesc);
  static void explain_ExecutorEnd(QueryDesc *queryDesc);
  
+ static void
+ auto_explain_sample_ratio_assign_hook(int newval, void *extra)
+ {
+ 	if (auto_explain_sample_ratio != newval)
+ 	{
+ 		/* Schedule a counter reset when the sample ratio changed */
+ 		auto_explain_sample_counter = -1;
+ 	}
+ 
+ 	auto_explain_sample_ratio = newval;
+ }
+ 
+ 
  
  /*
   * Module load callback
***
*** 159,164  _PG_init(void)
--- 178,195 
  			 NULL,
  			 NULL);
  
+ 	DefineCustomIntVariable("auto_explain.sample_ratio",
+ 		"Only explain one in approx. every sample_ratio queries, or 1 for all",
+ 			NULL,
+ 			&auto_explain_sample_ratio,
+ 			1,
+ 			1, INT_MAX - 1,
+ 			PGC_SUSET,
+ 			0,
+ 			NULL,
+ 			auto_explain_sample_ratio_assign_hook,
+ 			NULL);
+ 
  	EmitWarningsOnPlaceholders("auto_explain");
  
  	/* Install hooks. */
***
*** 191,196  _PG_fini(void)
--- 222,250 
  static void
  explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
  {
+ 	/*
+ 	 * For ratio sampling, only increment the counter for top-level
+ 	 * statements. Either all nested statements will be explained
+ 	 * or none will, because we need to know at ExecutorEnd hook time
+ 	 * whether or not we explained any given statement.
+ 	 */
  	if (nesting_level == 0 && auto_explain_sample_ratio > 1)
+ 	{
+ 		if (auto_explain_sample_counter == -1)
+ 		{
+ 			/*
+ 			 * First time the hook ran in this 
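
(The hunk above is truncated in the archive. As a sketch, the counter
logic plausibly continues along the following lines, reusing the patch's
own variable names -- a guess for illustration, not the author's actual
code:)

    if (auto_explain_sample_counter == -1)
    {
        /* first run, or the ratio just changed: pick a random phase */
        auto_explain_sample_counter = random() % auto_explain_sample_ratio;
    }
    else if (++auto_explain_sample_counter >= auto_explain_sample_ratio)
        auto_explain_sample_counter = 0;    /* 0 => explain this query */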

Re: [HACKERS] replication slot restart_lsn initialization

2015-07-07 Thread Gurjeet Singh
On Tue, Jul 7, 2015 at 4:59 AM, Andres Freund and...@anarazel.de wrote:

 On 2015-06-10 13:13:41 -0700, Gurjeet Singh wrote:
  + 	/*
  + 	 * Log an xid snapshot for logical replication. It's not needed for
  + 	 * physical slots as it is done in BGWriter on a regular basis.
  + 	 */
  + 	if (!slot->data.persistency == RS_PERSISTENT)
  + 	{
  + 		/* make sure we have enough information to start */
  + 		flushptr = LogStandbySnapshot();
  +
  + 		/* and make sure it's fsynced to disk */
  + 		XLogFlush(flushptr);
  + 	}

 Huh? The slot->data.persistency == RS_PERSISTENT check seems pretty much
 entirely random to me.


There seems to be a misplaced not operator "!" in that if statement, as
well. That sucks :( The MacOS gcc binary is actually clang, and its output
is too noisy [1], which is probably why I might have missed the warning, if any.

[1]: I am particularly annoyed by these:

note: remove extraneous parentheses around the comparison to silence this
warning
note: use '=' to turn this equality comparison into an assignment

 Eg. : if (((opaque)->btpo_next == 0))

I'll see what I can do about these.


 I mean physical slots can (and normally are)
 persistent as well?  Instead check whether it's a database-specific slot.


Agreed, the slot being database-specific is the right indicator.
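
Concretely, the quoted hunk would then become something like this (a
sketch of the corrected check only, not the final patch; SlotIsLogical()
is the existing macro for slot->data.database != InvalidOid):

    /*
     * Log an xid snapshot that logical decoding can start from. Physical
     * slots don't need this, as the bgwriter logs standby snapshots on a
     * regular basis anyway.
     */
    if (SlotIsLogical(slot))
    {
        /* make sure we have enough information to start */
        flushptr = LogStandbySnapshot();

        /* and make sure it's fsynced to disk */
        XLogFlush(flushptr);
    }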



 The reasoning why it's not helpful for physical slots is wrong. The
 point is that a standby snapshot at that location will simply not help
 physical replication, it's only going to read ones after the location at
 which the base backup starts (i.e. the location from the backup label).

   pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
   {
 	Name		name = PG_GETARG_NAME(0);
  + 	bool		activate = PG_GETARG_BOOL(1);

 Don't like 'activate' much as a name. How about 'immediately_reserve'?


I still like 'activate', but okay. How about 'reserve_immediately' instead?

Also, do you want this name change just in the C code, or in the pg_proc
and docs as well?



 Other than that it's looking good to me.


Will send a new patch after your feedback on the 'activate' renaming.

Best regards,
-- 
Gurjeet Singh http://gurjeet.singh.im/

