Re: [HACKERS] collations in shared catalogs?

2015-03-04 Thread Robert Haas
On Wed, Feb 25, 2015 at 7:54 PM, David Steele da...@pgmasters.net wrote:
 +1 on 128/256 character names.

 /me runs and hides.

 /stands brazenly in the open and volunteers to try it if I don't get
 clobbered within seconds.

I think the question is whether making lots of rows in system catalogs
bigger is going to have undesirable effects on (a) the size of our
initial on-disk format (i.e. how big an empty database is), (b) the
amount of memory consumed by the syscache and relcaches on workloads
that touch lots of tables/functions/whatever, or (c) CPU consumption
mostly as a result of more cache line accesses for the same operation.
If you can prove those effects are minimal, that'd be a good place to
start.
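
One way to put numbers on (a) and (b) is to build two clusters, one with
the default NAMEDATALEN and one with NAMEDATALEN=256, and compare (a
sketch; the queries are illustrative, not exhaustive):

  -- overall size of a freshly-created, empty database
  SELECT pg_size_pretty(pg_database_size(current_database()));

  -- largest system catalogs, where wider name columns would show up
  SELECT c.relname, pg_size_pretty(pg_total_relation_size(c.oid))
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
  WHERE n.nspname = 'pg_catalog'
  ORDER BY pg_total_relation_size(c.oid) DESC
  LIMIT 5;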

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bootstrap DATA is a pita

2015-03-04 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Another advantage of this is that it would probably make git less
 likely to fumble a rebase.  If there are lots of places in the file
 where we have the same 10 lines in a row with occasional variations,
 rebasing a patch could easily pick the wrong place to reapply the
 hunk.

That is a really, really good point.

I had been thinking it was a disadvantage of Andrew's proposal that
line breaks would tend to fall in inconsistent places from one entry
to another ... but from this perspective, maybe that's not such a
bad thing.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-03-04 Thread David Steele
Hi Robert,

On 3/4/15 10:14 AM, Robert Haas wrote:
 On Wed, Feb 25, 2015 at 7:54 PM, David Steele da...@pgmasters.net wrote:
 +1 on 128/256 character names.

 /me runs and hides.

 /stands brazenly in the open and volunteers to try it if I don't get
 clobbered within seconds.
 
 I think the question is whether making lots of rows in system catalogs
 bigger is going to have undesirable effects on (a) the size of our
 initial on-disk format (i.e. how big an empty database is), (b) the
 amount of memory consumed by the syscache and relcaches on workloads
 that touch lots of tables/functions/whatever, or (c) CPU consumption
 mostly as a result of more cache line accesses for the same operation.
 If you can prove those effects are minimal, that'd be a good place to
 start.

Thanks, that's encouraging.  I've already compiled with NAMEDATALEN=256
and verified that the only failing tests are the ones making sure that
identifier lengths are truncated or fail appropriately when they are >
63.  I'm sure lots of people have done that before and gotten the same
result.
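
As a quick sanity check of which build a given test run is using, the
server reports the compiled-in limit (NAMEDATALEN - 1) directly, so a
NAMEDATALEN=256 build should show 255 here:

  postgres=# SHOW max_identifier_length;
   max_identifier_length
  -----------------------
   63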

I'm currently investigating the issues that you've identified above
since they constitute the real problem with increasing NAMEDATALEN.
Once I have some answers I'll send a proposal to hackers.

-- 
- David Steele
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Weirdly pesimistic estimates in optimizer

2015-03-04 Thread Robert Haas
On Mon, Mar 2, 2015 at 2:19 PM, David Kubečka kubecka@gmail.com wrote:
 The question is why the optimizer, or rather the cost estimator, produced
 such different estimates upon a very small change in input. Moreover it seems
 that the main culprit of the bad estimates isn't actually directly related to
 the outer table, but rather to the inner table. Just compare estimates for the two
 index scans:

 With 'random_fk_dupl':
  ->  Index Scan using facts_fk_idx on facts  (cost=0.42..5.75
 rows=100 width=15) (actual time=0.009..0.117 rows=98 loops=100)
 With 'random_fk_uniq':
  ->  Index Scan using facts_fk_idx on facts  (cost=0.42..214.26
 rows=100 width=15) (actual time=0.007..0.109 rows=98 loops=100)

Whoa.  That's pretty dramatic.  Am I correctly understanding that the
two tables contain *exactly* the same data?  Could the issue be that
one of the tables is thought to have a higher logical-to-physical
correlation than the other (see pg_stats.correlation)?
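
For the record, that's easy to check directly; with the table names from
the test case, something like:

  SELECT tablename, attname, n_distinct, correlation
  FROM pg_stats
  WHERE tablename IN ('facts', 'random_fk_dupl', 'random_fk_uniq');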

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Andres Freund
Hi,

On 2015-03-04 10:52:30 -0500, Stephen Frost wrote:
 I've been discussing this with a few folks outside of the PG community
 (Debian and Openwall people specifically) and a few interesting ideas
 have come out of that which might be useful to discuss.
 
 The first is a "don't break anything" approach which would move the
 needle between network data sensitivity and on-disk data sensitivity
 a bit back in the direction of making the network data more sensitive.

I think that's a really bad tradeoff for pg. There's pretty good reasons
not to encrypt database connections. I don't think you really can
compare routinely encrypted stuff like imap and submission with
pg. Neither is it as harmful to end up with leaked hashes for database
users as it is for an email provider's authentication database.

 A lot of discussion has been going on with SCRAM and SASL, which is all
 great, but that means we end up with a dependency on SASL or we have to
 reimplement SCRAM (which I've been thinking might not be a bad idea-
 it's actually not that hard), but another suggestion was made which may
 be worthwhile to consider- OpenSSL and GnuTLS both support TLS-SRP, the
 RFC for which is here: http://www.ietf.org/rfc/rfc5054.txt.  We already
 have OpenSSL and therefore this wouldn't create any new dependencies and
 might be slightly simpler to implement.

We don't have a hard dependency on openssl, so I can't really see that
being a fully viable alternative to md5 TBH.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Magnus Hagander
On Wed, Mar 4, 2015 at 5:03 PM, Stephen Frost sfr...@snowman.net wrote:

 Magnus,

 * Magnus Hagander (mag...@hagander.net) wrote:
  On Wed, Mar 4, 2015 at 4:52 PM, Stephen Frost sfr...@snowman.net
 wrote:
   A lot of discussion has been going on with SCRAM and SASL, which is all
   great, but that means we end up with a dependency on SASL or we have to
   reimplement SCRAM (which I've been thinking might not be a bad idea-
   it's actually not that hard), but another suggestion was made which may
 
  I'd really rather not add a dependency on SASL if we can avoid it. I
  haven't read up on SCRAM, but if it's reasonable enough to reimplement -
 or
  if there is a BSD licensed implementation that we can import into our own
  sourcetree without adding a dependency on SASL, that sounds like a good
 way
  to proceed.

 I actually like the idea of supporting SASL generally, but I agree that
 we don't really want to force it as a dependency.  I've started looking
 around for BSD-licensed SCRAM implementations and will update with any I
 find that are worthwhile to review.

   be worthwhile to consider- OpenSSL and GnuTLS both support TLS-SRP, the
   RFC for which is here: http://www.ietf.org/rfc/rfc5054.txt.  We
 already
   have OpenSSL and therefore this wouldn't create any new dependencies
 and
   might be slightly simpler to implement.
 
  OpenSSL is not a *requirement* today, it's an optional dependency.  Given
  its license we really can't make it a mandatory requirement I think. So
 if
  we go down that route, we still leave md5 in there as the one that works
  everywhere.
 
  Also AFAICT TLS-SRP actually requires the connection to be over TLS - so
  are you suggesting that TLS becomes mandatory?
 
  It sounds like something that could be interesting to have, but not as a
  solution to the md5 problem, imo.

 No, I'm not suggesting that OpenSSL or TLS become mandatory but was
 thinking it might be good alternative as a middle-ground between full
 client-and-server side certificates and straight password-based auth
 (which is clearly why it was invented in the first place) and so, yes,
 md5 would still have to be kept around, but we'd at least be able to
 deprecate it and tell people "Use TLS-SRP if you really want to use
 passwords and care about network security."

 SCRAM doesn't actually fix the issue with network connection hijacking
 or eavesdropping, except to the extent that it protects the password
 itself, and so we might want to recommend, for people who are worried
 about network-based attacks, using TLS-SRP.


Assuming we do implement SCRAM, what does TLS-SRP give us that we wouldn't
get by just using SCRAM over a TLS connection?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] xpath changes in the recent back branches

2015-03-04 Thread Robert Haas
On Thu, Feb 19, 2015 at 5:53 AM, Marko Tiikkaja ma...@joh.to wrote:
 Commit 79af9a1d2668c9edc8171f03c39e7fed571eeb98 changed xpath handling with
 regard to namespaces, and it seems to be fixing an actual issue. However, it
 was also backpatched to all branches despite it breaking for example code
 like this:

 do $$
 declare
 _x xml;
 begin
 _x := (xpath('/x:Foo/x:Bar', xml '<Foo
 xmlns="teh:urn"><Bar><Baz>1</Baz><Bat>2</Bat></Bar></Foo>',
 array[['x','teh:urn']]))[1];
 raise notice '%', xpath('/Bar/Baz/text()', _x);
 raise notice '%', xpath('/Bar/Bat/text()', _x);
 end
 $$;

 The problem is that there's no way to write the code like this in such a way
 that it would work on both versions.  If I add the namespace, it's broken on
 9.1.14.  Without it, it's broken on 9.1.15.

That certainly sucks.

 I'm not sure how changing behavior like this in a minor release was
 considered acceptable.

I'm guessing that the fact that it changes behavior in cases like this
wasn't recognized, but I suppose Peter will have to be the one to
comment on that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Combining Aggregates

2015-03-04 Thread Robert Haas
On Wed, Mar 4, 2015 at 4:41 AM, David Rowley dgrowle...@gmail.com wrote:
 This thread mentions parallel queries as a use case, but that means
 passing data between processes, and that requires being able to
 serialize and deserialize the aggregate state somehow. For actual data
 types that's not overly difficult I guess (we can use in/out functions),
 but what about aggregates using 'internal' state? I.e. aggregates
 passing pointers that we can't serialize?

 This is a good question. I really don't know the answer to it as I've not
 looked at Robert's API for passing data between backends yet.

 Maybe Robert or Amit can answer this?

I think parallel aggregation will probably require both the
infrastructure discussed here, namely the ability to combine two
transition states into a single transition state, and also the ability
to serialize and de-serialize transition states, which has previously
been discussed in the context of letting hash aggregates spill to
disk.  My current thinking is that the parallel plan will look
something like this:

HashAggregateFinish
-> Funnel
   -> HashAggregatePartial
      -> PartialHeapScan

So the workers will all pull from a partial heap scan and each worker
will aggregate its own portion of the data.  Then it'll need to pass
the results of that step back to the master, which will aggregate the
partial results and produce the final results.  I'm guessing that if
we're grouping on, say, column a, the HashAggregatePartial nodes will
kick out 2-column tuples of the form (value for a, serialized
transition state for group with that value for a).
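
The combine step itself is easy to picture in plain SQL. A sketch for
avg(x) grouped by a, with two hypothetical tables slice1 and slice2
standing in for the workers' halves of the heap scan; the partial state
for avg is just (count, sum), and the master combines by summing both
before finalizing:

  WITH partial AS (
    SELECT a, count(x) AS n, sum(x) AS s FROM slice1 GROUP BY a
    UNION ALL
    SELECT a, count(x) AS n, sum(x) AS s FROM slice2 GROUP BY a
  )
  SELECT a, sum(s)::numeric / sum(n) AS avg_x  -- combine, then finalize
  FROM partial
  GROUP BY a;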

Of course this is all pie in the sky right now, but I think it's a
pretty good bet that partial aggregation is a useful thing to be able
to do.  Postgres-XC put a bunch of work into this, too, so there's
lots of people saying "hey, we want that".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Andres Freund
On 2015-03-04 11:06:33 -0500, Stephen Frost wrote:
 * Andres Freund (and...@2ndquadrant.com) wrote:
  On 2015-03-04 10:52:30 -0500, Stephen Frost wrote:
   The first is a "don't break anything" approach which would move the
   needle between network data sensitivity and on-disk data sensitivity
   a bit back in the direction of making the network data more sensitive.
  
  I think that's a really bad tradeoff for pg. There's pretty good reasons
  not to encrypt database connections. I don't think you really can
  compare routinely encrypted stuff like imap and submission with
  pg. Neither is it as harmful to end up with leaked hashes for database
  users as it is for an email provider's authentication database.
 
 I'm confused..  The paragraph you reply to here discusses an approach
 which doesn't include encrypting the database connection.

An increase in network data sensitivity also increases the need for
encryption.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] xpath changes in the recent back branches

2015-03-04 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Feb 19, 2015 at 5:53 AM, Marko Tiikkaja ma...@joh.to wrote:
 I'm not sure how changing behavior like this in a minor release was
 considered acceptable.

 I'm guessing that the fact that it changes behavior in cases like this
 wasn't recognized, but I suppose Peter will have to be the one to
 comment on that.

It was considered to be a bug fix; more, given the few complaints about
the clearly-broken old behavior, we thought it was a fix that would affect
few people, and them positively.  Sorry you're not happy about it, but
these things are always judgment calls.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Comparing primary/HS standby in tests

2015-03-04 Thread Jeff Janes
On Tue, Mar 3, 2015 at 7:49 AM, Andres Freund and...@2ndquadrant.com
wrote:

 Hi,

 I've regularly wished we had automated tests that setup HS and then
 compare primary/standby at the end to verify replay worked
 correctly.

 Heikki's page comparison tool deals with some of that verification, but
 it's really quite expensive and doesn't care about runtime only
 differences. I.e. it doesn't test HS at all.

 Every now and then I run installcheck against a primary, verify that
 replay works without errors, and then compare pg_dumpall from both
 clusters. Unfortunately that currently requires hand inspection of
 dumps, there are differences like:
 -SELECT pg_catalog.setval('default_seq', 1, true);
 +SELECT pg_catalog.setval('default_seq', 33, true);

 The reason for these differences is that the primary increases the
 sequence's last_value by 1, but temporarily sets it to +SEQ_LOG_VALS
 before XLogInsert(). So the two differ.

 Does anybody have a good idea how to get rid of that difference? One way
 to do that would be to log the value the standby is sure to have - but
 that's not entirely trivial.

 I'd very much like to add an automated test like this to the tree, but I
 don't see a way to do that sanely without a comparison tool...


Couldn't we just arbitrarily exclude sequence internal states from the
comparison?

That wouldn't work where the standby has been promoted and then used in a
way that draws on the sequence (with the same workload being put through
the now-promoted standby and the original master), though I don't think
that's what you were asking about.
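
(For reference, the mechanism behind the setval() difference quoted
above is easy to reproduce; SEQ_LOG_VALS is 32 in sequence.c:

  CREATE SEQUENCE default_seq;
  SELECT nextval('default_seq');  -- primary's last_value is now 1
  -- WAL pre-logged last_value + SEQ_LOG_VALS, so after replay the
  -- standby's dump reads setval('default_seq', 33, true).

so any exclusion would have to forgive exactly this kind of gap.)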

How many similar issues have you seen?

In the case where you have a promoted replica and put the same workflow
through both it and the master, I've seen pg_dump -s dump
objects in different orders, for no apparent reason.  That is kind of
annoying, but I never traced it back to the cause (nor have I excluded
PEBCAK as the real cause).

Cheers,

Jeff


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Magnus Hagander
On Wed, Mar 4, 2015 at 4:52 PM, Stephen Frost sfr...@snowman.net wrote:


 A lot of discussion has been going on with SCRAM and SASL, which is all
 great, but that means we end up with a dependency on SASL or we have to
 reimplement SCRAM (which I've been thinking might not be a bad idea-
 it's actually not that hard), but another suggestion was made which may



I'd really rather not add a dependency on SASL if we can avoid it. I
haven't read up on SCRAM, but if it's reasonable enough to reimplement - or
if there is a BSD licensed implementation that we can import into our own
sourcetree without adding a dependency on SASL, that sounds like a good way
to proceed.



 be worthwhile to consider- OpenSSL and GnuTLS both support TLS-SRP, the
 RFC for which is here: http://www.ietf.org/rfc/rfc5054.txt.  We already
 have OpenSSL and therefore this wouldn't create any new dependencies and
 might be slightly simpler to implement.



OpenSSL is not a *requirement* today, it's an optional dependency.  Given
its license we really can't make it a mandatory requirement I think. So if
we go down that route, we still leave md5 in there as the one that works
everywhere.

Also AFAICT TLS-SRP actually requires the connection to be over TLS - so
are you suggesting that TLS becomes mandatory?

It sounds like something that could be interesting to have, but not as a
solution to the md5 problem, imo.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Stephen Frost
Bruce, all,

I've been discussing this with a few folks outside of the PG community
(Debian and Openwall people specifically) and a few interesting ideas
have come out of that which might be useful to discuss.

The first is a "don't break anything" approach which would move the
needle between network data sensitivity and on-disk data sensitivity
a bit back in the direction of making the network data more sensitive.

This approach looks like this: pre-determine and store the values (on a
per-user basis, so a new field in pg_authid or some hack on the existing
field) which will be sent to the client in the AuthenticationMD5Password
message.  Further, calculate a random salt to be used when storing data
in pg_authid.  Then, for however many variations we feel are necessary,
calculate and store, for each AuthenticationMD5Password value:

md5_challenge, hash(salt || response)

We wouldn't store 4 billion of these, of course, which means that the
challenge / response system becomes less effective on a per-user basis.
We could, however, store X number of these and provide a lock-out
mechanism (something users have asked after for a long time..) which
would make it likely that the account would be locked before the
attacker was able to gain access.  Further, an attacker with access to
the backend still wouldn't see the user's cleartext password, nor would
we store the cleartext password or a token in pg_authid which could be
directly used for authentication, and we don't break the wireline
protocol or existing installations (since we could detect that the
pg_authid entry has the old-style and simply 'upgrade' it).

That's probably the extent of what we could do to improve the current
'md5' approach without breaking the wireline protocol or existing stored
data.
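
To make the exposure concrete, the current "md5" math can be reproduced
in SQL (illustration only, with a made-up user and password and a
4-character stand-in for the raw 4-byte session salt):

  -- what pg_authid stores: 'md5' || md5(password || username)
  SELECT 'md5' || md5('secret' || 'alice');

  -- what the client sends for a given salt; note it is computable from
  -- the stored value alone, without knowing the password:
  SELECT 'md5' || md5(md5('secret' || 'alice') || 'ABCD');

which is exactly why a stolen pg_authid entry is directly usable for
authentication today.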

A lot of discussion has been going on with SCRAM and SASL, which is all
great, but that means we end up with a dependency on SASL or we have to
reimplement SCRAM (which I've been thinking might not be a bad idea-
it's actually not that hard), but another suggestion was made which may
be worthwhile to consider- OpenSSL and GnuTLS both support TLS-SRP, the
RFC for which is here: http://www.ietf.org/rfc/rfc5054.txt.  We already
have OpenSSL and therefore this wouldn't create any new dependencies and
might be slightly simpler to implement.

Thoughts?

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Abbreviated keys for Numeric

2015-03-04 Thread Robert Haas
On Mon, Feb 23, 2015 at 5:59 AM, Andrew Gierth
and...@tao11.riddles.org.uk wrote:
 "Tomas" == Tomas Vondra tomas.von...@2ndquadrant.com writes:
  Tomas Interesting, but I think Gavin was asking about how much
  Tomas variation was there for each tested case (e.g. query executed on
  Tomas the same code / dataset). And in those cases the padding /
  Tomas alignment won't change, and thus the effects you describe won't
  Tomas influence the results, no?

 My point is exactly the fact that since the result is not affected, this
 variation between runs of the same code is of no real relevance to the
 question of whether a given change in performance can properly be
 attributed to a patch.

 Put it this way: suppose I have a test that when run repeatedly with no
 code changes takes 6.10s (s=0.025s), and I apply a patch that changes
 that to 6.26s (s=0.025s). Did the patch have an impact on performance?

 Now suppose that instead of applying the patch I insert random amounts
 of padding in an unused function and find that my same test now takes a
 mean of 6.20s (s=0.058s) when I take the best timing for each padding
 size and calculate stats across sizes. Now it looks obvious that the
 actual code of the patch probably wasn't responsible for any change...

 The numbers used here aren't theoretical; they are obtained by testing a
 single query - "select * from d_flt order by v offset 1000" where
 d_flt contains 5 million float8 values - over 990 times with 33
 different random padding sizes (uniform in 0-32767). Here's a scatter
 plot, with 3 runs of each padding size so you can see the repeatability:
 http://tinyurl.com/op9qg8a

Neat chart.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote:
 On 2015-03-04 11:06:33 -0500, Stephen Frost wrote:
  * Andres Freund (and...@2ndquadrant.com) wrote:
   On 2015-03-04 10:52:30 -0500, Stephen Frost wrote:
The first is a "don't break anything" approach which would move the
needle between network data sensitivity and on-disk data sensitivity
a bit back in the direction of making the network data more sensitive.
   
   I think that's a really bad tradeoff for pg. There's pretty good reasons
   not to encrypt database connections. I don't think you really can
   compare routinely encrypted stuff like imap and submission with
   pg. Neither is it as harmful to end up with leaked hashes for database
   users as it is for an email provider's authentication database.
  
  I'm confused..  The paragraph you reply to here discusses an approach
  which doesn't include encrypting the database connection.
 
 An increase in network data sensitivity also increases the need for
 encryption.

Ok, I see what you're getting at there, though our existing md5
implementation with no lock-out mechanism or ability to deal with
hijacking isn't exactly making us all that safe when it comes to network
based attacks.  The best part about md5 is that we don't send the user's
password over the wire in the clear, the actual challenge/response piece
is not considered terribly secure today, nor is the salt+password we use
for pg_authid for that matter. :/

SCRAM won't fix network connection hijacking but it does address replay
attacks better than our current challenge/response system (at least the
example in the RFC uses a 16-byte base64-encoded salt) and the on-disk
storage risk (multiple iterations are supported), and multiple hashing
algorithms can be supported including ones much better than what we
support today (eg: SHA256) which applies to both network and on-disk
vectors.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] improve pgbench syntax error messages

2015-03-04 Thread Fabien COELHO



As I mentioned on the other thread, I'd really like to get this into a
better format, where each error message is on one line.  Looking at
that, you can't tell whether you've got one mistake, two mistakes, or
three mistakes.


Indeed. Here is a v2.

  sh ./pgbench -f bad.sql
  bad.sql:3: syntax error at column 23 in command "set"
  \set aid (1021 * :id) %
^ error found here

  sh ./pgbench -f bad2.sql
  bad2.sql:1: unexpected argument (x) at column 25 in command "setrandom"
 \setrandom id 1 1000 x
  ^ error found here

  sh ./pgbench -f bad3.sql
  bad3.sql:1: too many arguments (gaussian) at column 35 in command "setrandom"
  \setrandom foo 1 10 gaussian 10.3 1
^ error found here

  sh ./pgbench -f bad4.sql
  bad4.sql:1: missing threshold argument (exponential) in command "setrandom"
  \setrandom foo 1 10 exponential


I think that transforming unexpected garbage into errors would be a good
move, even if this breaks backwards compatibility (currently a warning is 
printed about ignored extra stuff).


--
Fabien.

diff --git a/contrib/pgbench/exprscan.l b/contrib/pgbench/exprscan.l
index 4c9229c..dda2635 100644
--- a/contrib/pgbench/exprscan.l
+++ b/contrib/pgbench/exprscan.l
@@ -57,24 +57,36 @@ space			[ \t\r\f]
 }
 %%
 
+static char *expr_source = NULL;
+static int expr_lineno = 0;
+static char *expr_full_line = NULL;
+static char *expr_command = NULL;
+static int expr_col = 0;
+
 void
 yyerror(const char *message)
 {
-	/* yyline is always 1 as pgbench calls the parser for each line...
-	 * so the interesting location information is the column number */
-	fprintf(stderr, "%s at column %d\n", message, yycol);
-	/* go on to raise the error from pgbench with more information */
-	/* exit(1); */
+	syntax_error(expr_source, expr_lineno, expr_full_line, expr_command,
+ message, NULL, expr_col + yycol);
 }
 
 /*
  * Called before any actual parsing is done
  */
 void
-expr_scanner_init(const char *str)
+expr_scanner_init(const char *str, const char *source,
+  const int lineno, const char *line,
+  const char *cmd, const int ecol)
 {
 	Size	slen = strlen(str);
 
+	/* save context informations for error messages */
+	expr_source = (char *) source;
+	expr_lineno = (int) lineno;
+	expr_full_line = (char *) line;
+	expr_command = (char *) cmd;
+	expr_col = (int) ecol;
+
 	/*
 	 * Might be left over after error
 	 */
@@ -102,4 +114,9 @@ expr_scanner_finish(void)
 {
 	yy_delete_buffer(scanbufhandle);
 	pg_free(scanbuf);
+	expr_source = NULL;
+	expr_lineno = 0;
+	expr_full_line = NULL;
+	expr_command = NULL;
+	expr_col = 0;
 }
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 706fdf5..3e50bae 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -291,6 +291,7 @@ typedef struct
 	int			type;			/* command type (SQL_COMMAND or META_COMMAND) */
 	int			argc;			/* number of command words */
 	char	   *argv[MAX_ARGS]; /* command word list */
+	int			cols[MAX_ARGS]; /* corresponding column starting from 1 */
 	PgBenchExpr *expr;			/* parsed expression */
 } Command;
 
@@ -2189,6 +2190,28 @@ parseQuery(Command *cmd, const char *raw_sql)
 	return true;
 }
 
+void syntax_error(const char *source, const int lineno,
+  const char *line, const char *command,
+  const char *msg, const char *more, const int column)
+{
+	fprintf(stderr, "%s:%d: %s", source, lineno, msg);
+	if (more)
+		fprintf(stderr, " (%s)", more);
+	if (column != -1)
+		fprintf(stderr, " at column %d", column);
+	fprintf(stderr, " in command \"%s\"\n", command);
+	if (line) {
+		fprintf(stderr, "%s\n", line);
+		if (column != -1) {
+			int i = 0;
+			for (i = 0; i < column - 1; i++)
+				fprintf(stderr, " ");
+			fprintf(stderr, "^ error found here\n");
+		}
+	}
+	exit(1);
+}
+
 /* Parse a command; return a Command struct, or NULL if it's a comment */
 static Command *
 process_commands(char *buf, const char *source, const int lineno)
@@ -2200,6 +2223,11 @@ process_commands(char *buf, const char *source, const int lineno)
 	char	   *p,
 			   *tok;
 
+/* error message generation helper */
+#define SYNTAX_ERROR(msg, more, col)	\
+	syntax_error(source, lineno, my_commands->line,		\
+				 my_commands->argv[0], msg, more, col)
+
 	/* Make the string buf end at the next newline */
 	if ((p = strchr(buf, '\n')) != NULL)
 		*p = '\0';
@@ -2228,11 +2256,13 @@ process_commands(char *buf, const char *source, const int lineno)
 		j = 0;
 		tok = strtok(++p, delim);
 
+		/* stop "set" argument parsing after the variable name */
 		if (tok != NULL && pg_strcasecmp(tok, "set") == 0)
 			max_args = 2;
 
 		while (tok != NULL)
 		{
+			my_commands->cols[j] = tok - buf + 1;
 			my_commands->argv[j++] = pg_strdup(tok);
 			my_commands->argc++;
 			if (max_args >= 0 && my_commands->argc >= max_args)
@@ -2250,9 +2280,9 @@ process_commands(char *buf, const char *source, const int lineno)
 
 			if (my_commands->argc < 4)
 			{
-				fprintf(stderr, "%s: missing argument\n",

Re: [HACKERS] Bootstrap DATA is a pita

2015-03-04 Thread Andres Freund
On 2015-03-04 09:55:01 -0500, Robert Haas wrote:
 On Wed, Mar 4, 2015 at 9:42 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I wonder if we should have a tool in our repository to help people
 edit the file.  So instead of going in there yourself and changing
 things by hand, or writing your own script, you can do:
 
 updatepgproc.pl --oid 5678 provolatile=v
 
 or
 
 updatepgproc.pl --name='.*xact.*' prowhatever=someval
 
 Regardless of what format we end up with, that seems like it would
 make things easier.

The stuff I've started to work on basically allows loading the stuff
that Catalog.pm provides (in an extended format), editing it in memory, and
then serializing it again. So such a thing could relatively easily be
added if somebody wants to do so.  I sure hope though that the need for
it will become drastically lower with the new format.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Strange assertion using VACOPT_FREEZE in vacuum.c

2015-03-04 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Sun, Mar 1, 2015 at 6:58 AM, Michael Paquier
 michael.paqu...@gmail.com wrote:
  Hm, why not. That would remove all inconsistencies between the parser
  and the autovacuum code path. Perhaps something like the attached
  makes sense then?
 
 I really don't see this patch, or any of the previous ones, as solving
 any actual problem.  There's no bug here, and no reason to suspect
 that future code changes would be particularly likely to introduce one.
 Assertions are a great way to help developers catch coding mistakes,
 but it's a real stretch to think that a developer is going to add a
 new syntax for ANALYZE that involves setting options proper to VACUUM
 and not notice it.

Yeah, I haven't been terribly excited about it for the same reasons.
Had Michael's latest patch meant that we didn't need to pass VacuumStmt
down into the other functions then I might have been a bit more behind
it, but as is we seem to be simply duplicating everything except the
actual Node entry in the struct, which kind of missed the point.

 This thread started out because Michael read an assertion in the code
 and misunderstood what that assertion was trying to guard against.
 I'm not sure there's any code change needed here at all, but if there
 is, I suggest we confine it to adding a one-line comment above that
 assertion clarifying its purpose, like /* check that parser didn't
 produce ANALYZE FULL or ANALYZE FREEZE */.

I'd be fine with that.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote:
 Hi,
 
 On 2015-03-04 10:52:30 -0500, Stephen Frost wrote:
  I've been discussing this with a few folks outside of the PG community
  (Debian and Openwall people specifically) and a few interesting ideas
  have come out of that which might be useful to discuss.
  
  The first is a "don't break anything" approach which would move the
  needle between network data sensitivity and on-disk data sensitivity
  a bit back in the direction of making the network data more sensitive.
 
 I think that's a really bad tradeoff for pg. There's pretty good reasons
 not to encrypt database connections. I don't think you really can
 compare routinely encrypted stuff like imap and submission with
 pg. Neither is it as harmful to end up with leaked hashes for database
 users as it is for an email provider's authentication database.

I'm confused..  The paragraph you reply to here discusses an approach
which doesn't include encrypting the database connection.

  A lot of discussion has been going on with SCRAM and SASL, which is all
  great, but that means we end up with a dependency on SASL or we have to
  reimplement SCRAM (which I've been thinking might not be a bad idea-
  it's actually not that hard), but another suggestion was made which may
  be worthwhile to consider- OpenSSL and GnuTLS both support TLS-SRP, the
  RFC for which is here: http://www.ietf.org/rfc/rfc5054.txt.  We already
  have OpenSSL and therefore this wouldn't create any new dependencies and
  might be slightly simpler to implement.
 
 We don't have a hard dependency on openssl, so I can't really see that
 being a fully viable alternative to md5 TBH.

Right, agreed, that wasn't intended to be a complete replacement for md5
but rather an additional auth mechanism we could get nearly for free
which would provide password-based authentication with network-level
encryption for users who are worried about network-based attacks (and
therefore want to or are already using TLS, as Debian is configured to
do by default...).

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] xpath changes in the recent back branches

2015-03-04 Thread Marko Tiikkaja

On 3/4/15 6:19 PM, I wrote:

On 3/4/15 5:26 PM, Tom Lane wrote:

It was considered to be a bug fix; more, given the few complaints about
the clearly-broken old behavior, we thought it was a fix that would affect
few people, and them positively.


Yeah, but these things usually go the other way.  "This has been broken
for 10 years but nobody noticed, so we're not going to fix this"


Sorry, that should have said "we're not going to fix this *in the back
branches*".



.m


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] xpath changes in the recent back branches

2015-03-04 Thread Mike Rylander
On Thu, Feb 19, 2015 at 5:53 AM, Marko Tiikkaja ma...@joh.to wrote:

 Hi,

 Commit 79af9a1d2668c9edc8171f03c39e7fed571eeb98 changed xpath handling
 with regard to namespaces, and it seems to be fixing an actual issue.
 However, it was also backpatched to all branches despite it breaking for
 example code like this:

 do $$
 declare
 _x xml;
 begin
 _x := (xpath('/x:Foo/x:Bar', xml '<Foo
 xmlns="teh:urn"><Bar><Baz>1</Baz><Bat>2</Bat></Bar></Foo>',
 array[['x','teh:urn']]))[1];
 raise notice '%', xpath('/Bar/Baz/text()', _x);
 raise notice '%', xpath('/Bar/Bat/text()', _x);
 end
 $$;

 The problem is that there's no way to write the code like this in such a
 way that it would work on both versions.  If I add the namespace, it's
 broken on 9.1.14.  Without it, it's broken on 9.1.15.

 I'm now thinking of adding a workaround which strips namespaces, but that
 doesn't seem to be easy to do, even with PL/Perl.  Is there a better
 workaround here that I'm not seeing?


FWIW, I've been working around the bug fixed in that commit for ages by
spelling my xpath like this:

  xpath('/*[local-name()="Bar"]/*[local-name()="Baz"]/text()', data)

I've modularized my XML handling functions so the source of 'data' is
immaterial -- maybe it's a full document, maybe it's a fragment from a
previous xpath() call -- and the referenced commit is going to make correct
XPATH much more sane, readable, and maintainable.  I, for one, welcome it
wholeheartedly.
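
For example, applied to the document from Marko's test case, the
local-name() spelling should behave the same on both sides of the fix:

  SELECT xpath('/*[local-name()="Foo"]/*[local-name()="Bar"]/*[local-name()="Baz"]/text()',
               xml '<Foo xmlns="teh:urn"><Bar><Baz>1</Baz><Bat>2</Bat></Bar></Foo>');
  -- returns {1}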

HTH,

--Mike


Re: [HACKERS] pg_upgrade and rsync

2015-03-04 Thread Bruce Momjian
On Wed, Mar  4, 2015 at 01:53:47PM +0300, Vladimir Borodin wrote:
 After running initdb to create the new master, but before running
 pg_upgrade, modify the new master's postgresql.conf and change wal_level
 = hot_standby.  (Don't modify pg_hba.conf at this stage.)
 
 
 
 That does not help. The reason is that pg_upgrade sets 'Current wal_level
 setting: minimal' in control-file, and it does not depend on what is set in
 postgresql.conf before running pg_upgrade. Details can be seen here:
 http://pastie.org/9998671.

Well, what is happening is that the pg_resetxlog commands we run inside
pg_upgrade set the pg_controldata file's wal_level to minimal, but as
you saw, starting the server sets the pg_controldata properly. 
pg_resetxlog is not changing the WAL files at all, just the control
file.

 The workaround for this is to start and cleanly shut down postgres on master
 after running pg_upgrade but before running rsync. After that there would be a
 good control-file for streaming replication and rsync to replica can be done.

You are correct that a pg_controldata file is copied over that has
wal_level=minimal, but that should not be a problem.

 But it could not be done with the --size-only key, because the control
 file is of fixed size and rsync would skip it. Or maybe everything should
 be copied with --size-only and the control file should be copied without
 this option.

Well, what happens is that there is no _new_ standby pg_controldata
file, so it is copied fully from the new master.  Are you running initdb
to create the new standby --- you shouldn't be doing that as the rsync
will do that for you.  Also, are you cleanly shutting down all the
servers, or using pg_ctl -m immediate?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] xpath changes in the recent back branches

2015-03-04 Thread Marko Tiikkaja

On 3/4/15 5:26 PM, Tom Lane wrote:

Robert Haas robertmh...@gmail.com writes:

On Thu, Feb 19, 2015 at 5:53 AM, Marko Tiikkaja ma...@joh.to wrote:

I'm not sure how changing behavior like this in a minor release was
considered acceptable.



I'm guessing that the fact that it changes behavior in cases like this
wasn't recognized, but I suppose Peter will have to be the one to
comment on that.


It was considered to be a bug fix; more, given the few complaints about
the clearly-broken old behavior, we thought it was a fix that would affect
few people, and them positively.


Yeah, but these things usually go the other way.  "This has been broken
for 10 years but nobody noticed, so we're not going to fix this" is the
answer I'm more used to seeing.  And frankly, even though I remember
getting the wrong end of that hose a few times, I prefer that to 
breaking apps in minor releases.



.m


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Bruce Momjian
On Wed, Mar  4, 2015 at 10:52:30AM -0500, Stephen Frost wrote:
 The first is a "don't break anything" approach which would move the
 needle between network data sensitivity and on-disk data sensitivity
 a bit back in the direction of making the network data more sensitive.
 
 This approach looks like this: pre-determine and store the values (on a
 per-user basis, so a new field in pg_authid or some hack on the existing
 field) which will be sent to the client in the AuthenticationMD5Password
 message.  Further, calculate a random salt to be used when storing data
 in pg_authid.  Then, for however many variations we feel are necessary,
 calculate and store, for each AuthenticationMD5Password value:
 
 md5_challenge, hash(salt || response)
 
 We wouldn't store 4 billion of these, of course, which means that the
 challenge / response system becomes less effective on a per-user basis.
 We could, however, store X number of these and provide a lock-out
 mechanism (something users have asked after for a long time..) which
 would make it likely that the account would be locked before the
 attacker was able to gain access.  Further, an attacker with access to
 the backend still wouldn't see the user's cleartext password, nor would
 we store the cleartext password or a token in pg_authid which could be
 directly used for authentication, and we don't break the wireline
 protocol or existing installations (since we could detect that the
 pg_authid entry has the old-style and simply 'upgrade' it).

What does storing multiple hash(password || storage_salt) values do for
us that session_salt doesn't already do?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Comparing primary/HS standby in tests

2015-03-04 Thread Andres Freund
On 2015-03-04 08:41:23 -0800, Jeff Janes wrote:
 Couldn't we just arbitrarily exclude sequence internal states from the
 comparison?

Not sure what you mean? You mean just not dump them? I guess we could by
editing the contents of a custom format dump? A bit annoying to have a
script doing that...

 How many similar issues have you seen?

That's usually the only difference.

 In the case where you have a promoted replica and put the same through
 workflow through both it and the master, I've seen pg_dump -s dump
 objects in different orders, for no apparent reason.  That is kind of
 annoying, but I never traced it back to the cause (nor have I excluded
 PEBCAK as the real cause).

I'm not surprised. Independent runs - which you seem to be describing -
are quite dependent on on-disk order, and effects of concurrency. Oids
get assigned in different orders and such.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Bruce Momjian
On Wed, Mar  4, 2015 at 11:36:23AM -0500, Stephen Frost wrote:
 * Andres Freund (and...@2ndquadrant.com) wrote:
  On 2015-03-04 11:06:33 -0500, Stephen Frost wrote:
   * Andres Freund (and...@2ndquadrant.com) wrote:
On 2015-03-04 10:52:30 -0500, Stephen Frost wrote:
 The first is a "don't break anything" approach which would move the
 needle between network data sensitivity and on-disk data 
 sensitivity
 a bit back in the direction of making the network data more sensitive.

I think that's a really bad tradeoff for pg. There's pretty good reasons
not to encrypt database connections. I don't think you really can
compare routinely encrypted stuff like imap and submission with
pg. Neither is it as harmful to end up with leaked hashes for database
users as it is for an email provider's authentication database.
   
   I'm confused..  The paragraph you reply to here discusses an approach
   which doesn't include encrypting the database connection.
  
  An increase in network data sensitivity also increases the need for
  encryption.
 
 Ok, I see what you're getting at there, though our existing md5
 implementation with no lock-out mechanism or ability to deal with
 hijacking isn't exactly making us all that safe when it comes to network
 based attacks.  The best part about md5 is that we don't send the user's
 password over the wire in the clear, the actual challenge/response piece
 - here is where I was lost
 is not considered terribly secure today, nor is the salt+password we use
 for pg_authid for that matter. :/

Can you please rephrase the last sentence as it doesn't make sense to
me?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] compress method for spgist - 2

2015-03-04 Thread Paul Ramsey
On Wed, Feb 25, 2015 at 6:13 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 In the original post on this, you mentioned that the PostGIS guys planned to
 use this to store polygons, as bounding boxes
 (http://www.postgresql.org/message-id/5447b3ff.2080...@sigaev.ru). Any idea
 how would that work?

Poorly, by hanging boxes that straddled dividing lines off the parent
node in a big linear list. The hope would be that the case was
sufficiently rare compared to the overall volume of data, to not be an
issue. Oddly enough this big hammer has worked in other
implementations at least passably well
(https://github.com/mapserver/mapserver/blob/branch-7-0/maptree.c#L261)

P.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Stephen Frost
* Bruce Momjian (br...@momjian.us) wrote:
 On Wed, Mar  4, 2015 at 10:52:30AM -0500, Stephen Frost wrote:
  The first is a "don't break anything" approach which would move the
  needle between network data sensitivity and on-disk data sensitivity
  a bit back in the direction of making the network data more sensitive.
  
  This approach looks like this: pre-determine and store the values (on a
  per-user basis, so a new field in pg_authid or some hack on the existing
  field) which will be sent to the client in the AuthenticationMD5Password
  message.  Further, calculate a random salt to be used when storing data
  in pg_authid.  Then, for however many variations we feel are necessary,
  calculate and store, for each AuthenticationMD5Password value:
  
  md5_challenge, hash(salt || response)
  
  We wouldn't store 4 billion of these, of course, which means that the
  challenge / response system becomes less effective on a per-user basis.
  We could, however, store X number of these and provide a lock-out
  mechanism (something users have asked after for a long time..) which
  would make it likely that the account would be locked before the
  attacker was able to gain access.  Further, an attacker with access to
  the backend still wouldn't see the user's cleartext password, nor would
  we store the cleartext password or a token in pg_authid which could be
  directly used for authentication, and we don't break the wireline
  protocol or existing installations (since we could detect that the
  pg_authid entry has the old-style and simply 'upgrade' it).
 
 What does storing multiple hash(password || storage_salt) values do for
 us that session_salt doesn't already do?

By storing a hash of the result of the challenge/response, we wouldn't
be susceptible to attacks where the user has gained access to the
contents of pg_authid because the values there would not be (directly)
useful for authentication.  Today, an attacker can take what's in
pg_authid and directly use it to authenticate (which is what the
interwebs are complaining about).

We wouldn't want to do that for just a single value though because then
there wouldn't be any value to the challenge/response system (which is
intended to prevent replay attacks where the attacker has sniffed a
value from the network and then uses that value to authenticate
themselves).

The only way we can keep the session salt random without breaking the
wireline protocol is to keep the raw data necessary for authentication
in pg_authid (as we do now) since we'd need that information to recreate
the results of the random session salt+user-hash for comparison.

This is essentially a middle ground which maintains the existing
wireline protocol while changing what is in pg_authid to be something
that an attacker can't trivially use to authenticate.  It is not a
proper solution as that requires changing the wireline protocol (or,
really, extending it with another auth method that's better).

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Patch: raise default for max_wal_segments to 1GB

2015-03-04 Thread Greg Stark
On Tue, Mar 3, 2015 at 7:57 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Now, if we were to change the server so that it *refused* settings that
 didn't have a unit, that argument would become moot.  But I'm not going
 to defend the castle against the villagers who will show up if you do
 that.

That might be something we could get away with eventually if we put in
a warning like this for a few years:

WARNING: Using default units of seconds for GUC tcp_keepalives

Though we'll have to deal with the "0 means system default" and "-1
means really, really disabled" stuff.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Stephen Frost
* Bruce Momjian (br...@momjian.us) wrote:
 On Wed, Mar  4, 2015 at 11:36:23AM -0500, Stephen Frost wrote:
  * Andres Freund (and...@2ndquadrant.com) wrote:
   On 2015-03-04 11:06:33 -0500, Stephen Frost wrote:
* Andres Freund (and...@2ndquadrant.com) wrote:
 On 2015-03-04 10:52:30 -0500, Stephen Frost wrote:
  The first is a "don't break anything" approach which would move the
  needle between network data sensitivity and on-disk data 
  sensitivity
  a bit back in the direction of making the network data more 
  sensitive.
 
 I think that's a really bad tradeoff for pg. There's pretty good 
 reasons
 not to encrypt database connections. I don't think you really can
 compare routinely encrypted stuff like imap and submission with
 pg. Neither is it as harmful to end up with leaked hashes for database
  users as it is for an email provider's authentication database.

I'm confused..  The paragraph you reply to here discusses an approach
which doesn't include encrypting the database connection.
   
   An increase in network data sensitivity also increases the need for
   encryption.
  
  Ok, I see what you're getting at there, though our existing md5
  implementation with no lock-out mechanism or ability to deal with
  hijacking isn't exactly making us all that safe when it comes to network
  based attacks.  The best part about md5 is that we don't send the user's
  password over the wire in the clear, the actual challenge/response piece
  - here is where I was lost
  is not considered terribly secure today, nor is the salt+password we use
  for pg_authid for that matter. :/
 
 Can you please rephrase the last sentence as it doesn't make sense to
 me?

The best part of the existing authentication method we call "md5" is
that the user's password is never sent over the network in the clear.

The challenge/response implementation we have only provides for 4 bytes
of hash (or around four billion possible permutations) which is not very
secure today (as compared to the 16-character base64 salt used in SCRAM,
which is 16^64 or 2^96 instead of 2^32).

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] File based Incremental backup v8

2015-03-04 Thread Marco Nenciarini
Hi Fujii,

On 03/03/15 11:48, Fujii Masao wrote:
 On Tue, Mar 3, 2015 at 12:36 AM, Marco Nenciarini
 marco.nenciar...@2ndquadrant.it wrote:
 On 02/03/15 14:21, Fujii Masao wrote:
 On Thu, Feb 12, 2015 at 10:50 PM, Marco Nenciarini
 marco.nenciar...@2ndquadrant.it wrote:
 Hi,

 I've attached an updated version of the patch.

 basebackup.c:1565: warning: format '%lld' expects type 'long long
 int', but argument 8 has type '__off_t'
 basebackup.c:1565: warning: format '%lld' expects type 'long long
 int', but argument 8 has type '__off_t'
 pg_basebackup.c:865: warning: ISO C90 forbids mixed declarations and code


 I'll add an explicit cast at those two lines.

 When I applied three patches and compiled the code, I got the above 
 warnings.

 How can we get the full backup that we can use for the archive recovery, 
 from
 the first full backup and subsequent incremental backups? What commands 
 should
 we use for that, for example? It's better to document that.


 I've sent a python PoC that supports the plain format only (not the tar one).
 I'm currently rewriting it in C (also with tar support) and I'll send a
 new patch containing it ASAP.
 
 Yeah, if a special tool is required for that purpose, the patch should include
 it.
 

I'm working on it. The interface will be exactly the same as the PoC script
I've attached to 

54c7cdad.6060...@2ndquadrant.it

 What does the "1" in the heading line of backup_profile mean?


 Nothing. It's a version number. If you think it's misleading I will remove 
 it.
 
 A version number of file format of backup profile? If it's required for
 the validation of backup profile file as a safe-guard, it should be included
 in the profile file. For example, it might be useful to check whether
 pg_basebackup executable is compatible with the source backup that
 you specify. But more info might be needed for such validation.
 

The current implementation bails out with an error if the header line is
different from what it expects.
It also reports an error if the 2nd line is not the start WAL location. That's
all that pg_basebackup needs to start a new incremental backup. All the other
information is useful to reconstruct a full backup in case of an incremental
backup, or maybe to check the completeness of an archived full backup.
Initially the profile was present only in incremental backups, but after some 
discussion on list we agreed to always write it.

 Sorry if this has been already discussed so far. Why is a backup profile 
 file
 necessary? Maybe it's necessary in the future, but currently seems not.

 It's necessary because it's the only way to detect deleted files.
 
 Maybe I'm missing something. Seems we can detect that even without a profile.
 For example, please imagine the case where the file has been deleted since
 the last full backup and then the incremental backup is taken. In this case,
 that deleted file exists only in the full backup. We can detect the deletion 
 of
 the file by checking both full and incremental backups.
 

When you take an incremental backup, only changed files are sent. Without the 
backup_profile in the incremental backup, you cannot detect a deleted file, 
because it's indistinguishable from a file that is not changed.

 We've really gotten the consensus about the current design, especially that
 every files basically need to be read to check whether they have been 
 modified
 since last backup even when *no* modification happens since last backup?

 The real problem here is that there is currently no way to detect that a 
 file is not changed since the last backup. We agreed to not use file system 
 timestamps as they are not reliable for that purpose.
 
 TBH I prefer a timestamp-based approach in the first version of incremental
 backup, even if it's less reliable than the LSN-based one. I think that some
 users who are using timestamp-based rsync (i.e., the default mode) for backups
 would be satisfied with a timestamp-based one.

The original design was to compare size+timestamp+checksums (only if everything 
else matches and the file has been modified after the start of the backup), but 
the feedback from the list was that we cannot trust the filesystem mtime and we 
must use LSN instead.
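
For illustration, a minimal Python sketch of such an LSN check, assuming
standard 8kB pages whose header begins with the page LSN stored as two
32-bit integers (xlogid, xrecoff) in host byte order; this is only a model
of the idea, not the patch's C implementation:

    import struct

    BLCKSZ = 8192

    def file_changed_since(path, threshold_lsn):
        with open(path, "rb") as f:
            while True:
                page = f.read(BLCKSZ)
                if not page:
                    return False     # no newer page found: nothing to send
                xlogid, xrecoff = struct.unpack_from("=II", page, 0)
                if ((xlogid << 32) | xrecoff) > threshold_lsn:
                    return True      # changed: start sending right away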

 
 Using LSN has a significant advantage over using checksums, as we can start
 the full copy as soon as we find a block with an LSN greater than the
 threshold.
 There are two cases: 1) the file is changed, so we can assume that we detect
 it after reading 50% of the file, then we send it taking advantage of the file
 system cache; 2) the file is not changed, so we read it without sending
 anything.
 It will end up producing I/O comparable to a normal backup.
 
 Yeah, it might make the situation better than today. But I'm afraid that
 many users might get disappointed about that behavior of an incremental
 backup after the release...

I don't get what you mean here. Could you elaborate on this point?

Regards,
Marco

-- 
Marco Nenciarini - 

Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Stephen Frost
* Stephen Frost (sfr...@snowman.net) wrote:
 * Bruce Momjian (br...@momjian.us) wrote:
  On Wed, Mar  4, 2015 at 11:36:23AM -0500, Stephen Frost wrote:
   * Andres Freund (and...@2ndquadrant.com) wrote:
On 2015-03-04 11:06:33 -0500, Stephen Frost wrote:
 * Andres Freund (and...@2ndquadrant.com) wrote:
  On 2015-03-04 10:52:30 -0500, Stephen Frost wrote:
   The first is a don't break anything approach which would move 
   the
   needle between network data sensitivity and on-disk data 
   sensitivity
   a bit back in the direction of making the network data more 
   sensitive.
  
  I think that's a really bad tradeoff for pg. There's pretty good 
  reasons
  not to encrypt database connections. I don't think you really can
  compare routinely encrypted stuff like imap and submission with
  pg. Neither is it as harmful to end up with leaked hashes for 
  database
  users as it is for a email provider's authentication database.
 
 I'm confused..  The paragraph you reply to here discusses an approach
 which doesn't include encrypting the database connection.

An increase in network data sensitivity also increases the need for
encryption.
   
   Ok, I see what you're getting at there, though our existing md5
   implementation with no lock-out mechanism or ability to deal with
   hijacking isn't exactly making us all that safe when it comes to network
   based attacks.  The best part about md5 is that we don't send the user's
   password over the wire in the clear, the actual challenge/response piece
   - here is where I was lost
   is not considered terribly secure today, nor is the salt+password we use
   for pg_authid for that matter. :/
  
  Can you please rephrase the last sentence as it doesn't make sense to
  me?
 
 The best part of the existing authentication method we call md5 is
 that the user's password is never sent over the network in the clear.
 
 The challenge/response implementation we have only provides for 4 bytes
 of hash (or around four billion possible permutations) which is not very
 secure today (as compared to the 16-character base64 salt used in SCRAM,
 which is 16^64 or 2^96 instead of 2^32).

Err, 64^16 or 2^96, that is.  16^64 is not the same and would be way
larger. :)
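
A quick Python check of those figures (64 = 2^6, so 64^16 = 2^96, while
16 = 2^4 makes 16^64 equal to 2^256):

    assert 64**16 == 2**96    # 16 characters from a 64-symbol alphabet
    assert 16**64 == 2**256   # the transposed figure really is far larger
    print(2**96 // 2**32)     # 2^64: ~1.8e19 times more salts than 4 bytes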

Thanks!

Stephen




Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Bruce Momjian
On Wed, Mar  4, 2015 at 12:43:54PM -0500, Stephen Frost wrote:
  What does storing multiple hash(password || storage_salt) values do for
  us that session_salt doesn't already do?
 
 By storing a hash of the result of the challenge/response, we wouldn't
 be susceptible to attacks where the user has gained access to the
 contents of pg_authid because the values there would not be (directly)
 useful for authentication.  Today, an attacker can take what's in
 pg_authid and directly use it to authenticate (which is what the
 interwebs are complaining about).
 
 We wouldn't want to do that for just a single value though because then
 there wouldn't be any value to the challenge/response system (which is
 intended to prevent replay attacks where the attacker has sniffed a
 value from the network and then uses that value to authenticate
 themselves).

So you are storing the password + storage-salt + session-saltX, where X
is greater than the maximum number of login attempts?  How do you know
the attacker will not be given a salt that was already seen before?

 The only way we can keep the session salt random without breaking the
 wireline protocol is to keep the raw data necessary for authentication
 in pg_authid (as we do now) since we'd need that information to recreate
 the results of the random session salt+user-hash for comparison.
 
 This is essentially a middle ground which maintains the existing
 wireline protocol while changing what is in pg_authid to be something
 that an attacker can't trivially use to authenticate.  It is not a

I don't understand how this works.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Normalizing units for postgresql.conf WAS: Patch: raise default for max_wal_segments to 1GB

2015-03-04 Thread Robert Haas
On Tue, Mar 3, 2015 at 1:42 PM, Josh Berkus j...@agliodbs.com wrote:
 Sure.  Although, do we take (s) for tcp_keepalives_idle?  Or only an INT?

 It's a time unit, so you can say 10s or 1ms. If you don't
 specify a unit, it implies seconds.

 So if we're going to make this consistent, let's make it consistent.

 1. All GUCs which accept time/size units will have them on the default
 setting.

+1.

 2. Time/size comments will be removed, *except* from GUCs which do not
 accept (ms/s/min) or (kB/MB/GB).

+1.

 Argument Against: will create unnecessary diff changes between 9.4's
 pg.conf and 9.5's pg.conf.

I don't care about that.  I don't like to change postgresql.conf in
minor releases unless we have important reasons for doing so, but
changing it in major releases seems fine.

I do like Greg Stark's suggestion of also warning about unitless settings.
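
As a concrete illustration of item 1, a hypothetical postgresql.conf excerpt
under the proposal might read (the settings and values are today's defaults;
the comment wording is only a guess at what we'd ship):

    checkpoint_timeout = 5min        # range 30s-1h
    wal_sender_timeout = 60s         # 0 disables
    tcp_keepalives_idle = 0          # 0 selects the system default
    shared_buffers = 128MB           # min 128kB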

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Stephen Frost
* Heikki Linnakangas (hlinn...@iki.fi) wrote:
 I'm not sure how expensive a brute force attack on SRP would be,
 using a stolen backup tape. There doesn't seem to be an iterations
 count similar to SCRAM. But note that SRP's resistance to
 brute-forcing the authentication handshake is of a different kind.
 It's not just expensive, but outright impossible. (Don't ask me how
 that works; I'm not well-versed in the maths involved.) That's a big
 advantage because it means that it's OK to use a fairly weak
 password like 'foobar123' that would be trivially cracked with a
 dictionary attack.

If it's actually impossible then that's certainly interesting..  I don't
get how that's possible, but ok.

 (You can still connect to the server and try
 different passwords, but the server can log that and throttle how
 many guesses / minute it lets you do)

Wouldn't that be nice...  Wish we did it. :(

Thanks,

Stephen




Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Stefan Kaltenbrunner
On 03/04/2015 04:52 PM, Stephen Frost wrote:
 Bruce, all,
 
 I've been discussing this with a few folks outside of the PG community
 (Debian and Openwall people specifically) and a few interesting ideas
 have come out of that which might be useful to discuss.
 
 The first is a don't break anything approach which would move the
 needle between network data sensitivity and on-disk data sensitivity
 a bit back in the direction of making the network data more sensitive.
 
 this approach looks like this: pre-determine and store the values (on a
 per-user basis, so a new field in pg_authid or some hack on the existing
 field) which will be sent to the client in the AuthenticationMD5Password
 message.  Further, calculate a random salt to be used when storing data
 in pg_authid.  Then, for however many variations we feel are necessary,
 calculate and store, for each AuthenticationMD5Password value:
 
 md5_challenge, hash(salt || response)
 
 We wouldn't store 4 billion of these, of course, which means that the
 challenge / response system becomes less effective on a per-user basis.
 We could, however, store X number of these and provide a lock-out
 mechanism (something users have asked after for a long time..) which
 would make it likely that the account would be locked before the
 attacker was able to gain access.  Further, an attacker with access to
 the backend still wouldn't see the user's cleartext password, nor would
 we store the cleartext password or a token in pg_authid which could be
 directly used for authentication, and we don't break the wireline
 protocol or existing installations (since we could detect that the
 pg_authid entry has the old-style and simply 'upgrade' it).
 
 That's probably the extent of what we could do to improve the current
 'md5' approach without breaking the wireline protocol or existing stored
 data.
 
 A lot of discussion has been going on with SCRAM and SASL, which is all
 great, but that means we end up with a dependency on SASL or we have to
 reimplement SCRAM (which I've been thinking might not be a bad idea-
 it's actually not that hard), but another suggestion was made which may
 be worthwhile to consider- OpenSSL and GnuTLS both support TLS-SRP, the
 RFC for which is here: http://www.ietf.org/rfc/rfc5054.txt.  We already
 have OpenSSL and therefore this wouldn't create any new dependencies and
 might be slightly simpler to implement.

not sure we should depend on TLS-SRP - the libressl people removed the
support for SRP pretty early in the development process:
https://github.com/libressl/libressl/commit/f089354ca79035afce9ec649f54c18711a950ecd



Stefan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] failures with tuplesort and ordered set aggregates (due to 5cefbf5a6c44)

2015-03-04 Thread Peter Geoghegan
On Wed, Mar 4, 2015 at 8:26 AM, Robert Haas robertmh...@gmail.com wrote:
 I think we should commit my patch, and if a future patch needs
 sortKeys set in more places, it can make that change itself.  There's
 no reason why it's needed with the code as it is today, and no reason
 to let bits of future changes leak into a bug-fix patch.

It's not as if I feel strongly about it. It's trivial to change the
code to make the comment true, rather than change the comment to make
the code true, and we need to do so anyway. I defer to you.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Stephen Frost
* Heikki Linnakangas (hlinn...@iki.fi) wrote:
 The big difference between SRP and SCRAM is that if you eavesdrop
 the SCRAM handshake, you can use that information to launch a
 brute-force or dictionary attack. With SRP, you cannot do that. That
 makes it relatively safe to use weak passwords with SRP, which is
 not the case with SCRAM (nor MD5)

Thanks for the info!

Looking around a bit, one issue with SRP (as pointed out by Simon
Josefsson, the author of the SCRAM implementation for GNU SASL) is that
the username is included in the verifier (similar to our implementation
today with MD5) meaning that the stored data on the server is no longer
valid if the username is changed.  Obviously, our users are used to
that, but it's still something to be considered.

One question though- isn't the iteration option to SCRAM intended to
address the dictionary/brute force risk?  SRP uses an exponentiation
instead of iterations but it's unclear to me if one is really strictly
better or worse than the other (nor have I found any discussion of that
comparison) for this vector.

Thanks!

Stephen




Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Stephen Frost
* Bruce Momjian (br...@momjian.us) wrote:
 On Wed, Mar  4, 2015 at 01:27:32PM -0500, Stephen Frost wrote:
  This further makes what is sent over the network not directly
  susceptible to a replay attack because the server has multiple values
  available to pick for the salt to use and sends one at random to the
  client, exactly how our current challenge/response replay-prevention
  system works.  The downside is that the number of possible values for
 the server to send to prevent replay attack is reduced from 2^32 to N.
 
 OK, I understand now --- by not using a random session salt, you can
 store a post-hash of what you receive from the client, preventing the
 pg_authid from being resent by a client.  Nice trick, though going from
 2^32 to N randomness doesn't seem like a win.

You're only looking at it from the network attack vector angle where
clearly that's a reduction in strength.  That is not the only angle and
in many environments the network attack vector is already addressed with
TLS.

From the perspective of what everyone is currently complaining about on
the web, which is the pg_authid compromise vector, it'd be a huge
improvement over the current situation and we wouldn't be breaking any
existing clients, nor does it require having the postmaster see the
user's cleartext password during authentication (which is a common
argument against recommending the 'password' authentication method).

Thanks,

Stephen




Re: [HACKERS] Bootstrap DATA is a pita

2015-03-04 Thread Alvaro Herrera
Andrew Dunstan wrote:
 
 On 03/04/2015 09:51 AM, Robert Haas wrote:
 On Wed, Mar 4, 2015 at 9:06 AM, Peter Eisentraut pete...@gmx.net wrote:
 and make it harder to compare entries by grepping out some common
 substring.
 Could you give an example of the sort of thing you wish to do?
 e.g. grep for a function name and check that all the matches have the
 same volatility.
 
 I think grep will be the wrong tool for this format, but if we're settling
 on a perl format, a few perl one-liners should be able to work pretty well.
 It might be worth shipping a small perl module that provided some functions,
 or a script doing common tasks (or both).

I was going to say the same thing.  We need to make sure that the output
format of those one-liners is consistent, though -- it wouldn't be nice
if adding one column with nondefault value to a dozen of entries changes
the formatting of other entries.  For example, perhaps declare that the
order of entries is alphabetical or it matches something declared at the
start of the file.

From that POV, I don't like the idea of having multiple columns for a
single entry in a single line; adding more columns means that eventually
we're going to split lines that have become too long in a different
place, which would reformat the whole file; not very nice.  But maybe
this doesn't matter if we decree that changing the column split is a
manual chore rather than automatic, because then it can be done in a
separate mechanical commit after the extra column is added.

BTW one solution to the merge problem is to have unique separators for
each entry.  For instance, instead of

}   -- this is the end of the previous entry
,
{ 
oid = 2233,
proname = array_append,

we could have
} # array_prepend 2232
,
{ # array_append 2233
oid = 2233,
proname = array_append,

where the funcname-oid comment is there to avoid busted merges.  The
automatic editing tools make sure that those markers are always present.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Stephen Frost
* Bruce Momjian (br...@momjian.us) wrote:
 On Wed, Mar  4, 2015 at 02:21:51PM -0500, Stephen Frost wrote:
  * Bruce Momjian (br...@momjian.us) wrote:
   On Wed, Mar  4, 2015 at 01:27:32PM -0500, Stephen Frost wrote:
This further makes what is sent over the network not directly
susceptible to a replay attack because the server has multiple values
available to pick for the salt to use and sends one at random to the
client, exactly how our current challenge/response replay-prevention
system works.  The downside is that the number of possible values for
the server to send to prevent replay attack is reduced from 2^32 to N.
   
   OK, I understand now --- by not using a random session salt, you can
   store a post-hash of what you receive from the client, preventing the
   pg_authid from being resent by a client.  Nice trick, though going from
   2^32 to N randomness doesn't seem like a win.
  
  You're only looking at it from the network attack vector angle where
  clearly that's a reduction in strength.  That is not the only angle and
  in many environments the network attack vector is already addressed with
  TLS.
 
 Well, passwords are already addressed by certificate authentication, so
 what's your point?  I think we decided we wanted a way to do passwords
 without requiring network encryption.

It's completely unclear to me what you mean by passwords are already
addressed by certificate authentication.  Password-based and
certificate-based authentication are two independent mechanisms
available today, and both can be used with TLS.  Certainly the more
common implementation that I've come across is password-based
authentication through the md5 mechanism with TLS.  There are few places
which actually use client-side certificate-based authentication but tons
which use TLS.

  From the perspective of what everyone is currently complaining about on
  the web, which is the pg_authid compromise vector, it'd be a huge
  improvement over the current situation and we wouldn't be breaking any
  existing clients, nor does it require having the postmaster see the
  user's cleartext password during authentication (which is a common
  argument against recommending the 'password' authentication method).
 
 We are not designing only for what people are complaining about today.

I was simply trying to explain what the 'newly discovered' vector
being discussed today is.  I complained about our implementation here 10
years ago and it has come up before this recent wave of complaints since
then, though perhaps with a bit less publicity, but I suspect that's
just because PG is getting popular.

And, no, I'm not suggesting that we design differently because we're now
popular, but I do think we need to address this issue because it's very
clearly an obvious deficiency.  I trust you feel the same as you started
this thread.

I brought up this approach because it avoids breaking the wireline
protocol or forcing everyone to switch to a new authentication method.

Thanks,

Stephen




Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Stephen Frost
* Bruce Momjian (br...@momjian.us) wrote:
 On Wed, Mar  4, 2015 at 12:43:54PM -0500, Stephen Frost wrote:
   What does storing multiple hash(password || storage_salt) values do for
   us that session_salt doesn't already do?
  
  By storing a hash of the result of the challenge/response, we wouldn't
  be susceptible to attacks where the user has gained access to the
  contents of pg_authid because the values there would not be (directly)
  useful for authentication.  Today, an attacker can take what's in
  pg_authid and directly use it to authenticate (which is what the
  interwebs are complaining about).
  
  We wouldn't want to do that for just a single value though because then
  there wouldn't be any value to the challenge/response system (which is
  intended to prevent replay attacks where the attacker has sniffed a
  value from the network and then uses that value to authenticate
  themselves).
 
 So you are storing the password + storage-salt + session-saltX, where X
 is greater than the maximum number of login attempts?  How do you know
 the attacker will not be given a salt that was already seen before?

The password isn't stored and I don't know what you're talking about
regarding the number of login attempts.  Further, we don't know today if
an attacker has seen a particular challenge/response sequence or not (nor
can we...) and we can certainly repeat it.

  The only way we can keep the session salt random without breaking the
  wireline protocol is to keep the raw data necessary for authentication
  in pg_authid (as we do now) since we'd need that information to recreate
  the results of the random session salt+user-hash for comparison.
  
  This is essentially a middle ground which maintains the existing
  wireline protocol while changing what is in pg_authid to be something
  that an attacker can't trivially use to authenticate.  It is not a
 
 I don't understand how this works.

Ok, let me try to explain it another way.

The current system looks like this:

client has username and password
server has hash(username + password)

client:

  send auth request w/ username and database

server:

  send random salt to client

client:

  send hash(hash(username + password) + salt) to server

server:

  calculate hash(hash(username + password) + salt)
  compare to what client sent

What the interwebs are complaining about is that the 
hash(username + password) piece that's stored in pg_authid is
sufficient to authenticate.
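
To make that concrete, here is a rough Python model of today's exchange (in
the actual implementation the stored value is md5(password || username) and
the wire response carries an 'md5' prefix, but the structure is as
described):

    import hashlib, os

    def md5_hex(*parts):
        return hashlib.md5(b"".join(parts)).hexdigest().encode()

    user, password = b"alice", b"secret"

    stored = md5_hex(password, user)   # what pg_authid holds today

    salt = os.urandom(4)               # server's random 4-byte salt

    response = md5_hex(stored, salt)   # client's answer over the wire

    # server recomputes from pg_authid and compares; note that knowing
    # 'stored' alone suffices to produce 'response', which is the complaint
    assert response == md5_hex(stored, salt)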

Here's what was proposed as an alternative which would prevent that
without breaking the existing wireline protocol:

client has username and password
server has user_salt,
           N * {salt, hash(hash(hash(username + password) + salt), user_salt)}

client:

  send auth request w/ username and database

server:

  picks random salt from the salts available for this user
  sends salt to the user

client:

  send hash(hash(username + password) + salt) to server

server:

  server calculates, using the data from the client,
hash(FROM_CLIENT + user_salt)
  compares to hash stored for salt chosen

This makes what is stored in pg_authid no longer directly useful for
authentication (similar to how Unix passwd-based authentication works)
because if the client sent any of those values, we'd add the user_salt
and hash it again and it wouldn't match what's stored.
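
Continuing the model above, the proposed storage and check might look like
this (N, the per-entry salts, and the extra hashing layer follow the
description; all names are illustrative):

    import random

    N = 16
    user_salt = os.urandom(8)
    table = [(s, md5_hex(md5_hex(stored, s), user_salt))
             for s in (os.urandom(4) for _ in range(N))]
    # pg_authid now keeps only user_salt and these pairs; the raw 'stored'
    # hash is discarded, so a stolen table entry cannot simply be replayed
    # as the client's wire response

    salt, expected = random.choice(table)   # server picks one salt
    client_reply = md5_hex(stored, salt)    # client code is unchanged
    assert md5_hex(client_reply, user_salt) == expected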

This further makes what is sent over the network not directly
susceptible to a replay attack because the server has multiple values
available to pick for the salt to use and sends one at random to the
client, exactly how our current challenge/response replay-prevention
system works.  The downside is that the number of possible values for
the server to send to prevent replay attack is reduced from 2^32 to N.

To mitigate the replay risk we would, ideally, support a lock-out
mechanism.  This won't help if the attacker has extended network access
though as we would almost certainly eventually go through all N
permutations for this user.

However, use of TLS would prevent the network-based attack vector.

Using TLS + the 'password' authentication mechanism would also achieve
the goal of making what is in pg_authid not directly useful for
authentication, but that would mean that the postmaster would see the
user's password (post-decryption), leading to a risk that the password
could be reused for access to other systems by a rogue server
administrator.  Now, that's what SSH and *most* systems have had for
ages when it comes to password-based authentication and therefore it's
well understood in the industry and session-level encryption is
generally considered to avoid the network-based vector associated with
that approach.

For PG users, however, we encourage using md5 instead of password as we
consider that better, but users familiar with SSH and other
unix-password based authentication mechanisms might, understandably,
think that MD5+TLS prevents both attack vectors, but it doesn't.
Password+TLS is what they would 

Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Heikki Linnakangas

On 03/04/2015 06:11 PM, Stephen Frost wrote:

* Magnus Hagander (mag...@hagander.net) wrote:

On Wed, Mar 4, 2015 at 5:03 PM, Stephen Frost sfr...@snowman.net wrote:

No, I'm not suggesting that OpenSSL or TLS become mandatory but was
thinking it might be good alternative as a middle-ground between full
client-and-server side certificates and straight password-based auth
(which is clearly why it was invented in the first place) and so, yes,
md5 would still have to be kept around, but we'd at least be able to
deprecate it and tell people Use TLS-SRP if you really want to use
passwords and care about network security.

SCRAM doesn't actually fix the issue with network connection hijacking
or eavesdropping, except to the extent that it protects the password
itself, and so we might want to recommend, for people who are worried
about network-based attacks, using TLS-SRP.


Assuming we do implement SCRAM, what does TLS-SRP give us that we wouldn't
get by just using SCRAM over a TLS connection?


Good question and I'll have to dig more into that.  SCRAM does appear to
support channel binding with TLS and therefore there might not be much
to be gained from having both.


The big difference between SRP and SCRAM is that if you eavesdrop the 
SCRAM handshake, you can use that information to launch a brute-force or 
dictionary attack. With SRP, you cannot do that. That makes it 
relatively safe to use weak passwords with SRP, which is not the case 
with SCRAM (nor MD5)


Let me list the possible attacks that we're trying to protect against:

A) Eve eavesdrops on the authentication exchange. Can she use the 
information gathered directly to authenticate to the server?


B) Can Eve use the information to launch a dictionary or brute force the 
password?


C) Can a malicious server impersonate the real server? (i.e. does the 
protocol not authenticate the server to the client)


D) If Eve obtains a copy pg_authid (e.g from a backup tape), can she use 
that information to authenticate directly? (Brute forcing the passwords 
is always possible in this case)


             A)   B)   C)   D)
password     Yes  Yes  Yes  No [1]
MD5          No   Yes  Yes  Yes
SCRAM        No   Yes  No   No
SRP          No   No   No   No

[1] assuming that pg_authid stored MD5 hashes, not plaintext passwords, 
which should be the case these days.


Note that this table does not consider how difficult a brute-force 
attack is in each case; MD5 is a lot cheaper to calculate than SCRAM or 
SRP hashes. And there are more things to consider like implementation 
effort, strength of the underlying hash and other algorithms etc.


Also, attacks A), B) and C) can be thwarted by using SSL, with the 
client configured to check the server certificate (sslmode=verify-full). 
So actually, password authentication with SSL is not a bad option at 
all; it's actually better than MD5 because it doesn't allow attack D).


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reduce pinning in btree indexes

2015-03-04 Thread Kevin Grittner
Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote:

 The performance which your test shows looks great. The gain might
 come from the removal of buffer locking in _bt_next.

Yeah, I had been hoping that removing some buffer pins and locks
from the common path of scanning forward from one page to the next
might have some direct performance benefit; it's possible that this
will show up more dramatically at high concurrency levels.  The
main goal, of course, was to improve performance more indirectly,
by not having autovacuum worker processes blocked for extended
periods in some situations where that can now happen.

 In nbtutils.c, _bt_killitems:

 - This is not really the business of this patch, but the comment for
 _bt_killitems looks to have a trivial typo.

As you say, that problem was already there, but no reason not to
fix it in the patch as long as we're here.  Reworded.

 - The following comment looks somewhat wrong.

  /* Since the else condition needs page, get it here, too. */

 Does the else condition needs page mean that the page of the buffer
 is needed later? In any case, I suppose the comment itself is not
 necessary.

I guess that comment isn't really worth the space it takes up; what
it says is pretty obvious from the code.  Removed.

 - If (BTScanPosIsPinned(so->currPos)).

 As I mention below for nbtsearch.c, the page acquired in the
 if-then block may be vacuumed, so the LSN check done in the
 if-else block is needed also in the if-then block. It will be
 accomplished only by changing the position of the end of the
 if-else block.

I'm not sure I agree on this.  If the page is pinned it should have
been pinned continuously since we initially read it, so the line
pointers we have could not have been removed by any vacuum process.
The tuples may have been pruned away in the heap, but that doesn't
matter.  Index entries may have been added and the index page may
have been split, but that was true before this patch and
_bt_killitems will deal with those things the same as it always
has.  I don't see any benefit to doing the LSN compare in this
case; if we've paid the costs of holding the pin through to this
point, we might as well flag any dead entries we can.

 - _bt_killitems is called without pin when rescanning from
 before, so I think the previous code has already considered the
 unpinned case (if (ItemPointerEquals(ituple... does
 that). Vacuum rearranges line pointers after deleting LP_DEAD
 items so the previous check seems enough for the purpose. The
 previous code is more efficient because the matching items will
 be processed even after vacuum.

I'm not following you on this one; could you rephrase it?

 In nbtsearch.c

 - _bt_first(): It now releases the share lock on the page before
 the items to be killed is processed. This allows other backends
 to insert items into the page simultaneously. It seems to be
 dangerous alone, but _bt_killitems already considers the
 case. Anyway I think it'll be better to add a comment
 mentioning unlocking is not dangerous.

Well, _bt_killitems does acquire a shared (read) lock to flag index
tuples as LP_DEAD, and it has always been OK for the index page to
accept inserts and allow page splits before calling _bt_killitems.
I don't really see anything new with the patch in this regard.  I
do agree, though, that a lot of comments and at least two README
files refer to the pins blocking vacuum, and these must be found
and changed.

It doesn't seem worth posting to the list for the small changes
since the last version; I'll wait until I update the comments and
README files.  If you want review or test the latest, you can peek
at: https://github.com/kgrittn/postgres/tree/btnopin

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Bruce Momjian
On Wed, Mar  4, 2015 at 01:27:32PM -0500, Stephen Frost wrote:
 This further makes what is sent over the network not directly
 susceptible to a replay attack because the server has multiple values
 available to pick for the salt to use and sends one at random to the
 client, exactly how our current challenge/response replay-prevention
 system works.  The downside is that the number of possible values for
 the server to send to prevent replay attack is reduced from 2^32 to N.

OK, I understand now --- by not using a random session salt, you can
store a post-hash of what you receive from the client, preventing the
pg_authid from being resent by a client.  Nice trick, though going from
2^32 to N randomness doesn't seem like a win.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Bruce Momjian
On Wed, Mar  4, 2015 at 02:21:51PM -0500, Stephen Frost wrote:
 * Bruce Momjian (br...@momjian.us) wrote:
  On Wed, Mar  4, 2015 at 01:27:32PM -0500, Stephen Frost wrote:
   This further makes what is sent over the network not directly
   susceptible to a replay attack because the server has multiple values
   available to pick for the salt to use and sends one at random to the
   client, exactly how our current challenge/response replay-prevention
   system works.  The downside is that the number of possible values for
   the server to send to prevent replay attack is reduced from 2^32 to N.
  
  OK, I understand now --- by not using a random session salt, you can
  store a post-hash of what you receive from the client, preventing the
  pg_authid from being resent by a client.  Nice trick, though going from
  2^32 to N randomness doesn't seem like a win.
 
 You're only looking at it from the network attack vector angle where
 clearly that's a reduction in strength.  That is not the only angle and
 in many environments the network attack vector is already addressed with
 TLS.

Well, passwords are already addressed by certificate authentication, so
what's your point?  I think we decided we wanted a way to do passwords
without requiring network encryption.

 From the perspective of what everyone is currently complaining about on
 the web, which is the pg_authid compromise vector, it'd be a huge
 improvement over the current situation and we wouldn't be breaking any
 existing clients, nor does it require having the postmaster see the
 user's cleartext password during authentication (which is a common
 argument against recommending the 'password' authentication method).

We are not designing only for what people are complaining about today.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bootstrap DATA is a pita

2015-03-04 Thread Robert Haas
On Wed, Mar 4, 2015 at 2:27 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 Andrew Dunstan wrote:
 On 03/04/2015 09:51 AM, Robert Haas wrote:
 On Wed, Mar 4, 2015 at 9:06 AM, Peter Eisentraut pete...@gmx.net wrote:
 and make it harder to compare entries by grepping out some common
 substring.
 Could you give an example of the sort of thing you wish to do?
 e.g. grep for a function name and check that all the matches have the
 same volatility.

 I think grep will be the wrong tool for this format, but if we're settling
 on a perl format, a few perl one-liners should be able to work pretty well.
 It might be worth shipping a small perl module that provided some functions,
 or a script doing common tasks (or both).

 I was going to say the same thing.  We need to make sure that the output
 format of those one-liners is consistent, though -- it wouldn't be nice
 if adding one column with nondefault value to a dozen of entries changes
 the formatting of other entries.  For example, perhaps declare that the
 order of entries is alphabetical or it matches something declared at the
 start of the file.

 From that POV, I don't like the idea of having multiple columns for a
 single entry in a single line; adding more columns means that eventually
 we're going to split lines that have become too long in a different
 place, which would reformat the whole file; not very nice.  But maybe
 this doesn't matter if we decree that changing the column split is a
 manual chore rather than automatic, because then it can be done in a
 separate mechanical commit after the extra column is added.

 BTW one solution to the merge problem is to have unique separators for
 each entry.  For instance, instead of

 }   -- this is the end of the previous entry
 ,
 {
 oid = 2233,
 proname = array_append,

 we could have
 } # array_prepend 2232
 ,
 { # array_append 2233
 oid = 2233,
 proname = array_append,

 where the funcname-oid comment is there to avoid busted merges.  The
 automatic editing tools make sure that those markers are always present.

Speaking from entirely too much experience, that's not nearly enough.
git only needs 3 lines of context to apply a hunk with no qualms at
all, and it'll shade that to just 1 or 2 with little fanfare.  If your
pg_proc entries are each 20 lines long, this sort of thing will
provide little protection.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission

2015-03-04 Thread Peter Eisentraut
On 3/3/15 7:17 PM, Jim Nasby wrote:
 I think we're screwed in that regard anyway, because of the special
 constructs. You'd need different logic to handle things like +role and
 sameuser. We might even end up painted into a corner where we can't change
 it in the future because it'll break everyone's scripts.

Yeah, I'm getting worried about this.  I think most people agree that
getting a peek at pg_hba.conf from within the server is useful, but
everyone seems to have quite different uses for it.  Greg wants to join
against other catalog tables, Jim wants to reassemble a valid and
accurate pg_hba.conf, Josh wants to write an editing tool.  Personally,
I'd like to see something as close to the actual file as possible.

If there were an obviously correct way to map the various special
constructs to the available SQL data types, then fine.  But if there
isn't, then we shouldn't give a false overinterpretation.  So I'd render
everything that's disputed as a plain text field.  (Not even an array of
text.)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New CF app deployment

2015-03-04 Thread Peter Eisentraut
On 3/3/15 11:11 AM, Bruce Momjian wrote:
 On Tue, Mar  3, 2015 at 10:58:28AM -0500, Bruce Momjian wrote:
 Would you suggest removing the automated system completely, or keep it 
 around
 and just make it possible to override it (either by removing the note that
 something is a patch, or by making something that's not listed as a patch
 become marked as such)?

 One counter-idea would be to assume every attachment is a patch _unless_
 the attachment type matches a pattern that identifies it as not a patch.

 However, I agree with Tom that we should go a little longer before
 changing it.
 
 Also, can we look inside the attachment to see if it starts with
  'diff ' (i.e., diff followed by a space).

There are two different issues here:  One, how you detect a patch.  Two,
whether the latest patch is the patch of record for the commit fest item.

I think they are both currently wrong, but the second one is more
important.  Of course, if you punt on the first one, you have to do the
second one differently.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Stephen Frost
* Bruce Momjian (br...@momjian.us) wrote:
 On Wed, Mar  4, 2015 at 02:46:54PM -0500, Stephen Frost wrote:
   Well, passwords are already addressed by certificate authentication, so
   what's your point?  I think we decided we wanted a way to do passwords
   without requiring network encryption.
  
  It's completely unclear to me what you mean by passwords are already
  addressed by certificate authentication.  Password-based and
  certificate-based authentication are two independent mechanisms
  available today, and both can be used with TLS.  Certainly the more
  common implementation that I've come across is password-based
  authentication through the md5 mechanism with TLS.  There are few places
  which actually use client-side certificate-based authentication but tons
  which use TLS.
 
 My point is we can't design a new authentication method to replace MD5
 if it doesn't work well without TLS.

This isn't a discussion about designing a new authentication method to
replace MD5.  I'm not interested in doing that- we should be using one
which has already been designed by experts in that field (just like we'd
prefer they use PG instead of writing their own relational databases).

This is about trying to make the current situation around MD5 better
without breaking existing clients.  This proposal, in my view and at
least that of some others, does that.  Does it make the md5
authentication method secure?  No, but nothing is going to do that
except deprecating it entirely.

Further, to the extent that we can make *common use-cases* of md5-based
authentication better, that's a good thing.

  I was simply trying to explain what the 'newly discovered' vector
  being discussed today is.  I complained about our implementation here 10
  years ago and it has come up before this recent wave of complaints since
  then, though perhaps with a bit less publicity, but I suspect that's
  just because PG is getting popular.
  
  And, no, I'm not suggesting that we design differently because we're now
  popular, but I do think we need to address this issue because it's very
  clearly an obvious deficiency.  I trust you feel the same as you started
  this thread.
  
  I brought up this approach because it avoids breaking the wireline
  protocol or forcing everyone to switch to a new authentication method.
 
 Let me update my list of possible improvements:
 
 1)  MD5 makes users feel uneasy (though our usage is mostly safe)

Saying it's mostly safe is questionable, at best.

 2)  The per-session salt sent to the client is only 32-bits, meaning
 that it is possible to replay an observed MD5 hash in ~16k connection
 attempts.

Yes, and we have no (PG-based) mechanism to prevent those connection
attempts, which is a pretty horrible situation to be in.

 3)  Using the user name for the MD5 storage salt allows the MD5 stored
 hash to be used on a different cluster if the user used the same
 password.
 
 4)  Using the user name for the MD5 storage salt allows the MD5 stored
 hash to be used on the _same_ cluster.
 
 5)  Using the user name for the MD5 storage salt causes the renaming of
 a user to break the stored password.
 
 I see your idea of hashing what the client sends to the server is to
 correct #4?  And #2 becomes more of a problem?  And my idea of using a
 random storage salt and longer session salt fix numbers 2 and 3, but not
 4?

The proposal I've been discussing for improving md5 while not breaking
the existing wireline protocol would address 3 and 4 in your list while
making #2 require fewer connection attempts by an attacker (but then
again, with either approach, it's not like it takes a long time on
today's systems to reach the requisite number of attempts to get a
successful login).

I've not seen a detailed explanation of what your proposal is and so I'm
not sure that I can really comment on it since I don't really know what
it is.  The comment about using a random storage salt doesn't make any
sense since we can't have both a session salt *and* a storage salt, and
the comment about a longer session salt implies a break in the
wireline protocol, which goes against the premise of this discussion.

If we're going to break the wireline protocol then we need to go to
SCRAM or some existing, well-defined authentication system and not
something home-grown.

I'd suggest reviewing Heikki's list of attack vectors and the matrix he
built.

Thanks!

Stephen




Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Bruce Momjian br...@momjian.us writes:
  Let me update my list of possible improvements:
 
  1)  MD5 makes users feel uneasy (though our usage is mostly safe)
 
  2)  The per-session salt sent to the client is only 32-bits, meaning
  that it is possible to replay an observed MD5 hash in ~16k connection
  attempts.
 
  3)  Using the user name for the MD5 storage salt allows the MD5 stored
  hash to be used on a different cluster if the user used the same
  password.
 
  4)  Using the user name for the MD5 storage salt allows the MD5 stored
  hash to be used on the _same_ cluster.
 
  5)  Using the user name for the MD5 storage salt causes the renaming of
  a user to break the stored password.
 
 What happened to possession of the contents of pg_authid is sufficient
 to log in?  I thought fixing that was one of the objectives here.

Yes, it certainly was.  I think Bruce was thinking that we could simply
hash what goes on to disk with an additional salt that's stored, but
that wouldn't actually work without requiring a change to the wireline
protocol, which is the basis of this entire line of discussion, in my
view.

Thanks!

Stephen




Re: [HACKERS] xpath changes in the recent back branches

2015-03-04 Thread Peter Eisentraut
On 3/4/15 12:20 PM, Marko Tiikkaja wrote:
 On 3/4/15 6:19 PM, I wrote:
 On 3/4/15 5:26 PM, Tom Lane wrote:
 It was considered to be a bug fix; more, given the few complaints about
 the clearly-broken old behavior, we thought it was a fix that would
 affect
 few people, and them positively.

 Yeah, but these things usually go the other way.  This has been broken
 for 10 years but nobody noticed, so we're not going to fix this
 
 Sorry, that should have said we're not going to fix this *in the back
 branches*.

I tend to agree.  I was not in favor of backpatching, but other people
were in favor of it, and no one spoke up against it.

That said, if I understand your test case correctly, this would
basically be an argument against any semantic corrections in stable
releases, since user code could expect to continue to work with the
previous incorrect value.

You can always test the server version number, and you'll have to for
the next major release anyway.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Stephen Frost sfr...@snowman.net writes:
  * Tom Lane (t...@sss.pgh.pa.us) wrote:
  What happened to possession of the contents of pg_authid is sufficient
  to log in?  I thought fixing that was one of the objectives here.
 
  Yes, it certainly was.  I think Bruce was thinking that we could simply
  hash what goes on to disk with an additional salt that's stored, but
  that wouldn't actually work without requiring a change to the wireline
  protocol, which is the basis of this entire line of discussion, in my
  view.
 
 Hm, well, don't change the wireline protocol could be another wanna-have
 ... but if we want to stop using MD5, that's not really a realistic goal
 is it?

I'm trying to address both sides of the issue- improve the current
situation without breaking existing clients AND provide a new auth
method and encourage everyone to move to it as soon as they're able.  We
can't simply deprecate md5 as that would break pg_upgrade for any users
who are currently using it.

This half of the discussion has been all about improving md5 without
breaking the wireline protocol or existing users.

The other half of the discussion is about implementing a new
password-based authentication based on one of the vetted authentication
protocols already in use today (SCRAM or SRP, for example).  Using those
new authentication protocols would include a move off of the MD5 hashing
function, of course.  This would also mean breaking the on-disk hash,
but that's necessary anyway because what we do there today isn't secure
either and no amount of futzing is going to change that.

I've got nearly zero interest in trying to go half-way on this by
designing something that we think is secure which has had no external
review or anyone else who uses it.  Further, going that route makes me
very nervous that we'd decide on certain compromises in order to make
things easier for users without actually realising the problems with
such an approach (eg: well, if we use hack X we wouldn't have to
change what is stored on disk and therefore we wouldn't break
pg_upgrade..).  I'd *much* rather have a clean break and migration
path for users.  If we had a password rotation capability then this kind
of a transition would be a lot easier on our users and I'd definitely
recommend that we add that with this new authentication mechanism, to
address this kind of issue in the future (not to mention that's often
asked for..).  Password complexity would be another thing we should
really add and is also often requested.

Frankly, in the end, I don't see us being able to produce something in
time for this release unless someone can be dedicated to the effort over
the next couple of months, and therefore I'd prefer to improve the
current issues with md5 without breaking the wireline protocol than
simply do nothing (again).

Thanks,

Stephen




Re: [HACKERS] Bootstrap DATA is a pita

2015-03-04 Thread Robert Haas
On Wed, Mar 4, 2015 at 10:04 AM, Andrew Dunstan and...@dunslane.net wrote:
 Is it necessarily an all or nothing deal?

 Taking a previous example, we could have something like:

 {
  oid = 2249,  oiddefine = 'CSTRINGOID',  typname = 'cstring',
  typlen = -2, typbyval = 1,
  ...
 }

 which would allow us to fit within a reasonable edit window (for my normal
 window and font that's around 180 characters) and still reduce the number of
 lines.

 I'm not wedded to it, but it's a thought.

Another advantage of this is that it would probably make git less
likely to fumble a rebase.  If there are lots of places in the file
where we have the same 10 lines in a row with occasional variations,
rebasing a patch could easily pick the wrong place to reapply the
hunk.  I would personally consider a substantial increase in the rate
of such occurrences as being a cure far, far worse than the disease.
If you keep the entry for each function on just a couple of lines the
chances of this happening are greatly reduced, because you're much less
likely to get a false match to surrounding context.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Strange assertion using VACOPT_FREEZE in vacuum.c

2015-03-04 Thread Robert Haas
On Sun, Mar 1, 2015 at 6:58 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 Hm, why not. That would remove all inconsistencies between the parser
 and the autovacuum code path. Perhaps something like the attached
 makes sense then?

I really don't see this patch, or any of the previous ones, as solving
any actual problem.  There's no bug here, and no reason to suspect
that future code changes would be particularly likely to introduce one.
Assertions are a great way to help developers catch coding mistakes,
but it's a real stretch to think that a developer is going to add a
new syntax for ANALYZE that involves setting options proper to VACUUM
and not notice it.

This thread started out because Michael read an assertion in the code
and misunderstood what that assertion was trying to guard against.
I'm not sure there's any code change needed here at all, but if there
is, I suggest we confine it to adding a one-line comment above that
assertion clarifying its purpose, like /* check that parser didn't
produce ANALYZE FULL or ANALYZE FREEZE */.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Stephen Frost
Magnus,

* Magnus Hagander (mag...@hagander.net) wrote:
 On Wed, Mar 4, 2015 at 4:52 PM, Stephen Frost sfr...@snowman.net wrote:
  A lot of discussion has been going on with SCRAM and SASL, which is all
  great, but that means we end up with a dependency on SASL or we have to
  reimplement SCRAM (which I've been thinking might not be a bad idea-
  it's actually not that hard), but another suggestion was made which may
 
 I'd really rather not add a dependency on SASL if we can avoid it. I
 haven't read up on SCRAM, but if it's reasonable enough to reimplement - or
 if there is a BSD licensed implementation that we can import into our own
 sourcetree without adding a dependency on SASL, that sounds like a good way
 to proceed.

I actually like the idea of supporting SASL generally, but I agree that
we don't really want to force it as a dependency.  I've started looking
around for BSD-licensed SCRAM implementations and will update with any I
find that are worthwhile to review.

  be worthwhile to consider- OpenSSL and GnuTLS both support TLS-SRP, the
  RFC for which is here: http://www.ietf.org/rfc/rfc5054.txt.  We already
  have OpenSSL and therefore this wouldn't create any new dependencies and
  might be slightly simpler to implement.
 
 OpenSSL is not a *requirement* today, it's an optional dependency.  Given
 its license we really can't make it a mandatory requirement I think. So if
 we go down that route, we still leave md5 in there as the one that works
 everywhere.
 
 Also AFAICT TLS-SRP actually requires the connection to be over TLS - so
 are you suggesting that TLS becomes mandatory?
 
 It sounds like something that could be interesting to have, but not as a
 solution to the md5 problem, imo.

No, I'm not suggesting that OpenSSL or TLS become mandatory but was
thinking it might be good alternative as a middle-ground between full
client-and-server side certificates and straight password-based auth
(which is clearly why it was invented in the first place) and so, yes,
md5 would still have to be kept around, but we'd at least be able to
deprecate it and tell people Use TLS-SRP if you really want to use
passwords and care about network security.

SCRAM doesn't actually fix the issue with network connection hijacking
or eavesdropping, except to the extent that it protects the password
itself, and so we might want to recommend, for people who are worried
about network-based attacks, using TLS-SRP.

Thanks!

Stephen




Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Stephen Frost
* Magnus Hagander (mag...@hagander.net) wrote:
 On Wed, Mar 4, 2015 at 5:03 PM, Stephen Frost sfr...@snowman.net wrote:
  No, I'm not suggesting that OpenSSL or TLS become mandatory but was
  thinking it might be good alternative as a middle-ground between full
  client-and-server side certificates and straight password-based auth
  (which is clearly why it was invented in the first place) and so, yes,
  md5 would still have to be kept around, but we'd at least be able to
  deprecate it and tell people Use TLS-SRP if you really want to use
  passwords and care about network security.
 
  SCRAM doesn't actually fix the issue with network connection hijacking
  or eavesdropping, except to the extent that it protects the password
  itself, and so we might want to recommend, for people who are worried
  about network-based attacks, using TLS-SRP.
 
 Assuming we do implement SCRAM, what does TLS-SRP give us that we wouldn't
 get by just using SCRAM over a TLS connection?

Good question and I'll have to dig more into that.  SCRAM does appear to
support channel binding with TLS and therefore there might not be much
to be gained from having both.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Bruce Momjian
On Wed, Mar  4, 2015 at 02:46:54PM -0500, Stephen Frost wrote:
  Well, passwords are already addressed by certificate authentication, so
  what's your point?  I think we decided we wanted a way to do passwords
  without requiring network encryption.
 
 It's completely unclear to me what you mean by "passwords are already
 addressed by certificate authentication".  Password-based and
 certificate-based authentication are two independent mechanisms
 available today, and both can be used with TLS.  Certainly the more
 common implementation that I've come across is password-based
 authentication through the md5 mechanism with TLS.  There are few places
 which actually use client-side certificate-based authentication but tons
 which use TLS.

My point is we can't design a new authentication method to replace MD5
if it doesn't work well without TLS.

   From the perspective of what everyone is currently complaining about on
   the web, which is the pg_authid compromise vector, it'd be a huge
   improvement over the current situation and we wouldn't be breaking any
   existing clients, nor does it require having the postmaster see the
   user's cleartext password during authentication (which is a common
   argument against recommending the 'password' authentication method).
  
  We are not designing only for what people are complaining about today.
 
 I was simply trying to explain what the 'newly discovered' vector
 being discussed today is.  I complained about our implementation here 10
 years ago and it has come up before this recent wave of complaints since
 then, though perhaps with a bit less publicity, but I suspect that's
 just because PG is getting popular.
 
 And, no, I'm not suggesting that we design differently because we're now
 popular, but I do think we need to address this issue because it's very
 clearly an obvious deficiency.  I trust you feel the same as you started
 this thread.
 
 I brought up this approach because it avoids breaking the wireline
 protocol or forcing everyone to switch to a new authentication method.

Let me update my list of possible improvements:

1)  MD5 makes users feel uneasy (though our usage is mostly safe)

2)  The per-session salt sent to the client is only 32-bits, meaning
that it is possible to replay an observed MD5 hash in ~16k connection
attempts.

3)  Using the user name for the MD5 storage salt allows the MD5 stored
hash to be used on a different cluster if the user used the same
password.

4)  Using the user name for the MD5 storage salt allows the MD5 stored
hash to be used on the _same_ cluster.

5)  Using the user name for the MD5 storage salt causes the renaming of
a user to break the stored password.

I see your idea of hashing what the client sends to the server is to
correct #4?  And #2 becomes more of a problem?  And my idea of using a
random storage salt and a longer session salt fixes numbers 2 and 3, but
not 4?
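
To make the mechanics concrete, this is roughly the computation involved
today (a sketch only, not the server source; md5_hex() is a hypothetical
helper returning the 32-character lowercase hex digest of its input):

#include <stdio.h>
#include <string.h>

/* hypothetical helper: 32-char lowercase hex MD5 digest of (data, len) */
extern void md5_hex(const char *data, size_t len, char hex_out[33]);

/* stored in pg_authid: "md5" || md5(password || username) */
static void
stored_hash(const char *password, const char *user, char out[36])
{
	char	hex[33];
	char	buf[512];

	/* the user name is the storage salt: problems #3, #4, #5 */
	snprintf(buf, sizeof(buf), "%s%s", password, user);
	md5_hex(buf, strlen(buf), hex);
	snprintf(out, 36, "md5%s", hex);
}

/* sent on the wire: "md5" || md5(stored_hex || 4-byte session salt) */
static void
wire_response(const char stored[36], const char salt[4], char out[36])
{
	char	hex[33];
	char	buf[36];

	memcpy(buf, stored + 3, 32);	/* stored hex digest, minus the "md5" tag */
	memcpy(buf + 32, salt, 4);		/* only 32 bits of per-session salt: #2 */
	md5_hex(buf, 36, hex);
	snprintf(out, 36, "md5%s", hex);
}

Note that the server only ever needs the stored value, never the cleartext
password, to verify wire_response(), which is exactly why possession of
pg_authid is sufficient to log in.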

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bootstrap DATA is a pita

2015-03-04 Thread Peter Eisentraut
On 3/4/15 9:51 AM, Robert Haas wrote:
 On Wed, Mar 4, 2015 at 9:06 AM, Peter Eisentraut pete...@gmx.net wrote:
 and make it harder to compare entries by grepping out some common
 substring.

 Could you give an example of the sort of thing you wish to do?
 
 e.g. grep for a function name and check that all the matches have the
 same volatility.

You could still do that with grep -A or something like that.  I think
it would be easier than it is now.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Let me update my list of possible improvements:

 1)  MD5 makes users feel uneasy (though our usage is mostly safe)

 2)  The per-session salt sent to the client is only 32-bits, meaning
 that it is possible to replay an observed MD5 hash in ~16k connection
 attempts.

 3)  Using the user name for the MD5 storage salt allows the MD5 stored
 hash to be used on a different cluster if the user used the same
 password.

 4)  Using the user name for the MD5 storage salt allows the MD5 stored
 hash to be used on the _same_ cluster.

 5)  Using the user name for the MD5 storage salt causes the renaming of
 a user to break the stored password.

What happened to "possession of the contents of pg_authid is sufficient
to log in"?  I thought fixing that was one of the objectives here.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Tom Lane (t...@sss.pgh.pa.us) wrote:
 What happened to "possession of the contents of pg_authid is sufficient
 to log in"?  I thought fixing that was one of the objectives here.

 Yes, it certainly was.  I think Bruce was thinking that we could simply
 hash what goes on to disk with an additional salt that's stored, but
 that wouldn't actually work without requiring a change to the wireline
 protocol, which is the basis of this entire line of discussion, in my
 view.

Hm, well, "don't change the wireline protocol" could be another wanna-have
... but if we want to stop using MD5, that's not really a realistic goal
is it?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] improve pgbench syntax error messages

2015-03-04 Thread Fabien COELHO



Indeed. Here is a v2.


Here is a v3, which (1) activates better error messages from bison
and (2) improves the error reporting from the scanner as well.


 sh ./pgbench -f bad.sql
 bad.sql:3: syntax error at column 23 in command "set"
 \set aid (1021 * :id) %
   ^ error found here


the improved message is:
  bad.sql:3: syntax error, unexpected $end at column 23 in command "set"
  \set aid (1021 * :id) %
^ error found here

Also scanner errors are better:

  sh ./pgbench -f bad5.sql
  bad5.sql:1: unexpected character (a) at column 12 in command "set"
  \set foo 12abc
 ^ error found here

--
Fabien.diff --git a/contrib/pgbench/exprparse.y b/contrib/pgbench/exprparse.y
index 243c6b9..0405a50 100644
--- a/contrib/pgbench/exprparse.y
+++ b/contrib/pgbench/exprparse.y
@@ -26,6 +26,10 @@ static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr,
 %expect 0
 %name-prefix=expr_yy
 
+/* better error reporting with bison */
+%define parse.lac full
+%define parse.error verbose
+
 %union
 {
 	int64		ival;
diff --git a/contrib/pgbench/exprscan.l b/contrib/pgbench/exprscan.l
index 4c9229c..9de7dab 100644
--- a/contrib/pgbench/exprscan.l
+++ b/contrib/pgbench/exprscan.l
@@ -18,6 +18,13 @@ static YY_BUFFER_STATE scanbufhandle;
 static char *scanbuf;
 static int	scanbuflen;
 
+/* context information for error reporting */
+static char *expr_source = NULL;
+static int expr_lineno = 0;
+static char *expr_full_line = NULL;
+static char *expr_command = NULL;
+static int expr_col = 0;
+
 /* flex 2.5.4 doesn't bother with a decl for this */
 int expr_yylex(void);
 
@@ -52,7 +59,9 @@ space			[ \t\r\f]
 
 .{
 	yycol += yyleng;
-	fprintf(stderr, "unexpected character '%s'\n", yytext);
+	syntax_error(expr_source, expr_lineno, expr_full_line, expr_command,
+ "unexpected character", yytext, expr_col + yycol);
+	/* dead code, exit is called from syntax_error */
 	return CHAR_ERROR;
 }
 %%
@@ -60,21 +69,27 @@ space			[ \t\r\f]
 void
 yyerror(const char *message)
 {
-	/* yyline is always 1 as pgbench calls the parser for each line...
-	 * so the interesting location information is the column number */
-	fprintf(stderr, "%s at column %d\n", message, yycol);
-	/* go on to raise the error from pgbench with more information */
-	/* exit(1); */
+	syntax_error(expr_source, expr_lineno, expr_full_line, expr_command,
+ message, NULL, expr_col + yycol);
 }
 
 /*
  * Called before any actual parsing is done
  */
 void
-expr_scanner_init(const char *str)
+expr_scanner_init(const char *str, const char *source,
+  const int lineno, const char *line,
+  const char *cmd, const int ecol)
 {
 	Size	slen = strlen(str);
 
+	/* save context information for error messages */
+	expr_source = (char *) source;
+	expr_lineno = (int) lineno;
+	expr_full_line = (char *) line;
+	expr_command = (char *) cmd;
+	expr_col = (int) ecol;
+
 	/*
 	 * Might be left over after error
 	 */
@@ -102,4 +117,9 @@ expr_scanner_finish(void)
 {
 	yy_delete_buffer(scanbufhandle);
 	pg_free(scanbuf);
+	expr_source = NULL;
+	expr_lineno = 0;
+	expr_full_line = NULL;
+	expr_command = NULL;
+	expr_col = 0;
 }
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 706fdf5..3e50bae 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -291,6 +291,7 @@ typedef struct
 	int			type;			/* command type (SQL_COMMAND or META_COMMAND) */
 	int			argc;			/* number of command words */
 	char	   *argv[MAX_ARGS]; /* command word list */
+	int			cols[MAX_ARGS]; /* corresponding column starting from 1 */
 	PgBenchExpr *expr;			/* parsed expression */
 } Command;
 
@@ -2189,6 +2190,28 @@ parseQuery(Command *cmd, const char *raw_sql)
 	return true;
 }
 
+void syntax_error(const char *source, const int lineno,
+  const char *line, const char *command,
+  const char *msg, const char *more, const int column)
+{
+	fprintf(stderr, "%s:%d: %s", source, lineno, msg);
+	if (more)
+		fprintf(stderr, " (%s)", more);
+	if (column != -1)
+		fprintf(stderr, " at column %d", column);
+	fprintf(stderr, " in command \"%s\"\n", command);
+	if (line) {
+		fprintf(stderr, "%s\n", line);
+		if (column != -1) {
+			int i = 0;
+			for (i = 0; i < column - 1; i++)
+				fprintf(stderr, " ");
+			fprintf(stderr, "^ error found here\n");
+		}
+	}
+	exit(1);
+}
+
 /* Parse a command; return a Command struct, or NULL if it's a comment */
 static Command *
 process_commands(char *buf, const char *source, const int lineno)
@@ -2200,6 +2223,11 @@ process_commands(char *buf, const char *source, const int lineno)
 	char	   *p,
 			   *tok;
 
+/* error message generation helper */
+#define SYNTAX_ERROR(msg, more, col)	\
+	syntax_error(source, lineno, my_commands->line,		\
+ my_commands->argv[0], msg, more, col)
+
 	/* Make the string buf end at the next newline */
 	if ((p = strchr(buf, '\n')) != NULL)
 		*p = '\0';
@@ -2228,11 +2256,13 @@ process_commands(char *buf, const char *source, const int 

Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Heikki Linnakangas

On 03/04/2015 08:59 PM, Stephen Frost wrote:

* Heikki Linnakangas (hlinn...@iki.fi) wrote:

The big difference between SRP and SCRAM is that if you eavesdrop
the SCRAM handshake, you can use that information to launch a
brute-force or dictionary attack. With SRP, you cannot do that. That
makes it relatively safe to use weak passwords with SRP, which is
not the case with SCRAM (nor MD5)


Thanks for the info!

Looking around a bit, one issue with SRP (as pointed out by Simon
Josefsson, the author of the SCRAM implementation for GNU SASL) is that
the username is included in the verifier (similar to our implementation
today with MD5) meaning that the stored data on the server is no longer
valid if the username is changed.  Obviously, our users are used to
that, but it's still something to be considered.


Yes, good point, that's yet another thing that needs to be considered.


One question though- isn't the iteration option to SCRAM intended to
address the dictionary/brute force risk?  SRP uses an exponentiation
instead of iterations but it's unclear to me if one is really strictly
better or worse than the other (nor have I found any discussion of that
comparison) for this vector.


Not sure what you mean. Yes, the iterations option in SCRAM is designed
to make brute forcing more expensive, both for a captured authentication
handshake and for a stolen backup tape.


I'm not sure how expensive a brute force attack on SRP would be, using a 
stolen backup tape. There doesn't seem to be an iterations count similar 
to SCRAM. But note that SRP's resistance to brute-forcing the 
authentication handshake is of a different kind. It's not just 
expensive, but outright impossible. (Don't ask me how that works; I'm 
not well-versed in the maths involved.) That's a big advantage because 
it means that it's OK to use a fairly weak password like 'foobar123' 
that would be trivially cracked with a dictionary attack. (You can still 
connect to the server and try different passwords, but the server can 
log that and throttle how many guesses per minute it lets you do)


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea: closing the loop for pg_ctl reload

2015-03-04 Thread Peter Eisentraut
On 3/3/15 7:34 PM, Jim Nasby wrote:
 Definitely no multi-line. If we keep that restriction, couldn't we just
 dedicate one entire line to the error message? ISTM that would be safe.

But we have multiline error messages.  If we put only the first line in
the pid file, then all the tools that build on this will effectively
show users truncated information, which won't be a good experience.  If
we don't put any error message in there, then users or tools will be
more likely to look up the full message somewhere else.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Robert Haas
On Wed, Mar 4, 2015 at 10:52 AM, Stephen Frost sfr...@snowman.net wrote:
 I've been discussing this with a few folks outside of the PG community
 (Debian and Openwall people specifically) and a few interesting ideas
 have come out of that which might be useful to discuss.

 The first is a don't break anything approach which would move the
 needle between network data sensitivity and on-disk data sensitivity
 a bit back in the direction of making the network data more sensitive.

 This approach looks like this: pre-determine and store the values (on a
 per-user basis, so a new field in pg_authid or some hack on the existing
 field) which will be sent to the client in the AuthenticationMD5Password
 message.  Further, calculate a random salt to be used when storing data
 in pg_authid.  Then, for however many variations we feel are necessary,
 calculate and store, for each AuthenticationMD5Password value:

 md5_challenge, hash(salt || response)

 We wouldn't store 4 billion of these, of course, which means that the
 challenge / response system becomes less effective on a per-user basis.
 We could, however, store X number of these and provide a lock-out
 mechanism (something users have asked after for a long time..) which
 would make it likely that the account would be locked before the
 attacker was able to gain access.  Further, an attacker with access to
 the backend still wouldn't see the user's cleartext password, nor would
 we store the cleartext password or a token in pg_authid which could be
 directly used for authentication, and we don't break the wireline
 protocol or existing installations (since we could detect that the
 pg_authid entry has the old-style and simply 'upgrade' it).

So, the server can only authenticate the user with the salts it has
stored, because those are the only salts for which it knows what the
response should be?  But then if somebody steals pg_authid, they'll
have the answers to the exact same set of questions that the server
knows how to ask.  And on top of that, replay attacks become massively
easier.  Any value you pick for X is going to be many orders of
magnitude smaller than 4 billion, and if it's not entirely trivial
you'll also have a huge expansion of the size of a pg_authid row.
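
For concreteness, the proposed per-user storage would be shaped roughly
like this (struct and field names invented for illustration; sizes are
arbitrary):

#include <stdint.h>

/* one pre-computed challenge/verifier pair */
typedef struct Md5ChallengeSlot
{
	uint32_t	challenge;		/* salt sent in AuthenticationMD5Password */
	char		verifier[65];	/* hex of hash(storage_salt || client response) */
} Md5ChallengeSlot;

/* per-user entry: X slots instead of today's single md5 hash */
typedef struct Md5ChallengePool
{
	char		storage_salt[16];	/* random, per user */
	int			nslots;				/* "X" in the discussion above */
	Md5ChallengeSlot slots[];		/* C99 flexible array member */
} Md5ChallengePool;

Note that the server stores only hash(storage_salt || response), not the
response itself, which is the distinction Stephen draws below; the row-size
growth is nslots * sizeof(Md5ChallengeSlot) per role.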

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 So, the server can only authenticate the user with the salts it has
 stored, because those are the only salts for which it knows what the
 response should be?  

That's correct- except that it doesn't directly know what the response
is, it only knows what the result of hashing the response with the
information provided by the client is.

 But then if somebody steals pg_authid, they'll
 have the answers to the exact same set of questions that the server
 knows how to ask.  

Not quite.  They can't provide what's in pg_authid as an authentication
token in this case; that's the main thrust of this change from what is
in pg_authid today.  They would know which of the challenges are able to
be sent to the client and the result of hashing that with what they
actually need to send as an auth token to the server, but that's no
easier than today's unix password-based /etc/shadow approach where even
if you know the answer (in the form of the salt and the resulting
hash), you can't trivially provide the token needed to authenticate (the
password).

 And on top of that, replay attacks become massively
 easier.  Any value you pick for X is going to be many orders of
 magnitude smaller than 4 billion, and if it's not entirely trivial
 you'll also have a huge expansion of the size of a pg_authid row.

This is correct; as I was discussing with Bruce, replay attacks would
certainly require fewer attempts, though it's not like it'd take long to
do ~16k connection attempts today (how many you'd actually need to
attempt to have a high probability of getting in successfully with our
existing 4-byte salt).

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] xpath changes in the recent back branches

2015-03-04 Thread Robert Haas
On Wed, Mar 4, 2015 at 4:11 PM, Peter Eisentraut pete...@gmx.net wrote:
 That said, if I understand your test case correctly, this would
 basically be an argument against any semantic corrections in stable
 releases, since user code could expect to continue to work with the
 previous incorrect value.

 You can always test the server version number, and you'll have to for
 the next major release anyway.

That's not really the problem, of course.  The problem is you upgrade
and your app unexpectedly breaks.  The complexity of fixing that once
it's happened is not entirely irrelevant, but it's not really the main
problem, either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea: closing the loop for pg_ctl reload

2015-03-04 Thread Alvaro Herrera
Peter Eisentraut wrote:
 On 3/3/15 7:34 PM, Jim Nasby wrote:
  Definitely no multi-line. If we keep that restriction, couldn't we just
  dedicate one entire line to the error message? ISTM that would be safe.
 
 But we have multiline error messages.  If we put only the first line in
 the pid file, then all the tools that build on this will effectively
 show users truncated information, which won't be a good experience.  If
 we don't put any error message in there, then users or tools will be
 more likely to look up the full message somewhere else.

Would it work to have it be multiline using here-doc syntax?  It could
be printed similarly to what psql does to error messages, with a prefix
in front of each component (ERROR, STATEMENT, etc) and a leading tab for
continuation lines.  The last line is a terminator that matches
something specified in the first error line.  This becomes pretty
complicated for a PID file, mind you.
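
Purely to illustrate that idea, the tail of postmaster.pid might then look
something like this (the exact syntax below is invented):

LAST_ERROR <<PG_EOM_42
ERROR	invalid value for parameter "shared_buffers": "lots"
	CONTEXT	line 12 of configuration file ".../postgresql.conf"
PG_EOM_42

i.e. one prefixed line per error component, tab-indented continuation
lines, and a terminator named on the first line.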

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea: closing the loop for pg_ctl reload

2015-03-04 Thread Andres Freund
On 2015-03-04 19:04:02 -0300, Alvaro Herrera wrote:
 This becomes pretty complicated for a PID file, mind you.

Yes.

Let's get the basic feature (notification of failed reloads) done
first. That will be required with or without including the error
message.  Then we can get more fancy later, if somebody really wants to
invest the time.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using 128-bit integers for sum, avg and statistics aggregates

2015-03-04 Thread Andreas Karlsson

On 01/29/2015 12:28 AM, Peter Geoghegan wrote:

On Mon, Jan 26, 2015 at 11:21 PM, Andreas Karlsson andr...@proxel.se wrote:

Do you also think the SQL functions should be named numeric_int128_sum,
numeric_int128_avg, etc?


Some quick review comments. These apply to int128-agg-v5.patch.

* Why is there no declaration of the function
numeric_int16_stddev_internal() within numeric.c?


Because there is no declaration of numeric_stddev_internal() either. If
there should be, I could add declarations for both.



* I concur with others - we should stick to int16 for the SQL
interface. The inconsistency there is perhaps slightly confusing, but
that's historic.


Agreed.


* I'm not sure about the idea of polymorphic catalog functions (that
return the type internal, but the actual struct returned varying
based on build settings).

I tend to think that things would be better if there was always a
uniform return type for such internal type returning functions, but
that *its* structure varied according to the availability of int128
(i.e. HAVE_INT128) at compile time. What we should probably do is have
a third aggregate struct, that encapsulates this idea of (say)
int2_accum() piggybacking on one of either Int128AggState or
NumericAggState directly. Maybe that would be called PolyNumAggState.
Then this common code is all that is needed on both types of build (at
the top of int4_accum(), for example):

PolyNumAggState *state;

state = PG_ARGISNULL(0) ? NULL : (PolyNumAggState *) PG_GETARG_POINTER(0);

I'm not sure if I'd do this with a containing struct or a simple
#ifdef HAVE_INT128 typedef #else ... , but I think it's an
improvement either way.


Not entirely sure exactly what you mean, but I have changed to something
like this, relying on a typedef, in the attached version of the patch. I
think it looks better after the changes.
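
Concretely, the typedef variant looks something like this (a sketch; the
names mirror the discussion above rather than the final patch):

#ifdef HAVE_INT128
typedef Int128AggState PolyNumAggState;
#else
typedef NumericAggState PolyNumAggState;
#endif

/* so every integer transition function can start uniformly with: */
PolyNumAggState *state;

state = PG_ARGISNULL(0) ? NULL : (PolyNumAggState *) PG_GETARG_POINTER(0);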



* You didn't update this unequivocal comment to not be so strong:

  * Integer data types all use Numeric accumulators to share code and
  * avoid risk of overflow.


Fixed.

I have attached a new version of the patch which fixes the issues above 
plus moves the autoconf code to the right place (from configure.in to 
c-compiler.m4).


--
Andreas Karlsson
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 509f961..cea74e1 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -125,6 +125,41 @@ undefine([Ac_cachevar])dnl
 ])# PGAC_TYPE_64BIT_INT
 
 
+# PGAC_HAVE___INT128
+# ------------------
+# Check if __int128 is a working 128 bit integer type and
+# define HAVE___INT128 if so.
+AC_DEFUN([PGAC_HAVE___INT128],
+[AC_CACHE_CHECK(for __int128, pgac_cv_have___int128,
+[AC_TRY_RUN([
+__int128 a = 2001;
+__int128 b = 4005;
+
+int does_int128_work()
+{
+  __int128 c,d;
+
+  c = a * b;
+  d = (c + b) / b;
+  if (d != a+1)
+    return 0;
+  return 1;
+}
+main() {
+  exit(! does_int128_work());
+}],
+[pgac_cv_have___int128=yes],
+[pgac_cv_have___int128=no],
+[# If cross-compiling, check the size reported by the compiler and
+# trust that the arithmetic works.
+AC_COMPILE_IFELSE([AC_LANG_BOOL_COMPILE_TRY([], [sizeof(__int128) == 16])],
+  pgac_cv_have___int128=yes,
+  pgac_cv_have___int128=no)])])
+
+if test x$pgac_cv_have___int128 = xyes ; then
+  AC_DEFINE(HAVE___INT128, 1, [Define to 1 if the system has the type `__int128'.])
fi])# PGAC_HAVE___INT128
+
 
 # PGAC_C_FUNCNAME_SUPPORT
 # ---
diff --git a/configure b/configure
index fa271fe..1cd964d 100755
--- a/configure
+++ b/configure
@@ -13773,6 +13773,74 @@ _ACEOF
 fi
 
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __int128" >&5
+$as_echo_n "checking for __int128... " >&6; }
+if ${pgac_cv_have___int128+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  if test "$cross_compiling" = yes; then :
+  # If cross-compiling, check the size reported by the compiler and
+# trust that the arithmetic works.
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+static int test_array [1 - 2 * !(sizeof(__int128) == 16)];
+test_array [0] = 0;
+return test_array [0];
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  pgac_cv_have___int128=yes
+else
+  pgac_cv_have___int128=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+else
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+__int128 a = 2001;
+__int128 b = 4005;
+
+int does_int128_work()
+{
+  __int128 c,d;
+
+  c = a * b;
+  d = (c + b) / b;
+  if (d != a+1)
+    return 0;
+  return 1;
+}
+main() {
+  exit(! does_int128_work());
+}
+_ACEOF
+if ac_fn_c_try_run "$LINENO"; then :
+  pgac_cv_have___int128=yes
+else
+  pgac_cv_have___int128=no
+fi
+rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \
+  conftest.$ac_objext conftest.beam conftest.$ac_ext
+fi
+
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_have___int128" >&5
+$as_echo "$pgac_cv_have___int128" >&6; }
+
+if test 

Re: [HACKERS] Additional role attributes superuser review

2015-03-04 Thread Peter Eisentraut
On 2/28/15 10:10 PM, Stephen Frost wrote:
 Adam,
 
 * Adam Brightwell (adam.brightw...@crunchydatasolutions.com) wrote:
 I have attached and updated patch for review.
 
 Thanks!  I've gone over this and made quite a few documentation and
 comment updates, but not too much else, so I'm pretty happy with how
 this is coming along.  As mentioned elsewhere, this conflicts with the
 GetUserId() to has_privs_of_role() cleanup, but as I anticipate handling
 both this patch and that one, I'll find some way to manage. :)
 
 Updated patch attached.  Barring objections, I'll be moving forward with
 this soonish.  Would certainly appreciate any additional testing or
 review that you (or anyone!) has time to provide.

Let's move this discussion to the right thread.

Why are we not using roles and function execute privileges for this?



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] anyarray

2015-03-04 Thread Peter Eisentraut
On 2/13/15 10:20 AM, Teodor Sigaev wrote:
 Some users of the intarray contrib module wish to use its features with
 other kinds of arrays, not only the int4 type.  The suggested module
 generalizes intarray over other (not all) types of pgsql.
 
 Anyarray also provides a calculation of the similarity of two
 one-dimensional arrays, similar to the smlar module.  The anyarray module
 doesn't provide anything similar to the query_int feature of intarray,
 because that feature is very hard to implement (it requires a new
 pseudo-type anyquery), it is close to impossible to have operator
 extensibility, and its complicated queries are hidden from pgsql's
 optimizer (similar to jsquery).  As far as I know, query_int isn't very
 popular for now.

intarray has its uses, so it's not hard to believe that someone will
want, say, a bigint array instead.  So generalizing this is useful.

I think this module should be merged with the intarray module.  Having
two modules with very similar functionality would be confusing.

Moreover, the two modules provide almost, but not quite, the same set of
operators.  You mentioned that the query stuff is probably not that
useful, but the overlaps, contains, and contained operators probably
are, and they look like they could be generalized.

I'm not sure about the similarity stuff.  No explanation or
documentation is provided for what it's useful for or what method to choose.

GiST and GIN support is only provided for a finite set of built-in
types, and there is no documentation for how to extend this to other
types.  The GiST and GIN support is the main point of intarray.  If
anyarray can only support a finite set of hardcoded types, then it's not
really that "any".

If we can't do better, then I'd rather update the intarray module to
provide polymorphic functions and add index support for, say, bigint and
any other type that a good case can be made for.  I don't think
_money_aa_ops for example is going to have many users.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] anyarray

2015-03-04 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On 2/13/15 10:20 AM, Teodor Sigaev wrote:
 Some of users of intarray contrib module wish to use its features with
 another kind of arrays, not only for int4 type. Suggested module
 generalizes intarray over other (not all) types op pgsql.

 I think this module should be merged with the intarray module.  Having
 two modules with very similar functionality would be confusing.

Perhaps.  I think it would be hard to remove intarray without breaking
things for existing users of it; even if the functionality remains under
another name.  And surely we don't want to generalize intarray while
keeping that same name.  So it might be hard to get to a clean solution.

Speaking of names, I can't avoid the feeling that it is a seriously bad
idea to name an extension the same thing as an existing core type.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CATUPDATE confusion?

2015-03-04 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 Any other opinions?
 The options are:
 0) Leave as is.
 1) Remove catupdate and replace with superuser checks.
 2) Remove catupdate and rely on regular table permissions on catalogs.

I think I vote for (1).  I do not like (2) because of the argument I made
upthread that write access on system catalogs is far more dangerous than
a naive DBA might think.  (0) has some attraction but really CATUPDATE
is one more concept than we need.

On the other hand, if Stephen is going to push forward with his plan to
subdivide superuserness, we might have the equivalent of CATUPDATE right
back again.  (But at least it would be properly documented and
supported...)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Weirdly pesimistic estimates in optimizer

2015-03-04 Thread David Kubečka
Hi Tomas and others,

2015-03-02 21:29 GMT+01:00 Tomas Vondra tomas.von...@2ndquadrant.com:

 Hi David ;-)

 On 2.3.2015 20:19, David Kubečka wrote:
 
  The question is why optimizer, or rather the cost estimator,
  produced so much different estimates upon very small change in input.
  Moreover it seems that the main culprit of bad estimates isn't
  actually directly related to outer table, but rather to inner table.
  Just compare estimates for the two index scans:
 
  With 'random_fk_dupl':
   ->  Index Scan using facts_fk_idx on facts  (cost=0.42..5.75
  rows=100 width=15) (actual time=0.009..0.117 rows=98 loops=100)
  With 'random_fk_uniq':
   ->  Index Scan using facts_fk_idx on facts  (cost=0.42..214.26
  rows=100 width=15) (actual time=0.007..0.109 rows=98 loops=100)
 
  I have read the optimizer README file and also looked briefly at the
   code, but this seems to be something not related to particular
  implementation of algorithm (e.g. nested loop). Perhaps it's the way
  how cost estimates are propagated down (or sideways? that would be
  weird...) the query tree. But I am really not sure, since this is my
  first time lookng at the optimizer code base. I should also add that
  I have reproduced this behaviour for all versions of Pg from 9.2 up
  to current devel.

 Interesting. I've only spent a few minutes looking at this, but it
 certainly is a bit strange that using smaller table with unique values
 results in a slower plan than using a table with duplicities (but the
 same number of unique values).


Yep. And I think that I have found the cause of the strangeness. I have
traced the planner code and this is how cost_index from costsize.c is
called when computing the cost of index scan on outer (facts) table, when
the inner table is with duplicate entries (random_fk_dupl):

cost_index (path=0x27f8fb8, root=0x27670e8, loop_count=9920)

The important thing here is the value of loop_count which is the (perfect
estimate of) number of rows of random_fk_dupl (I have recreated the table
so it differs slightly from previously posted version (9893)). Now the
predominant part of running cost is computed by this expression:

run_cost += max_IO_cost + csquared * (min_IO_cost - max_IO_cost);

Since csquared (indexCorrelation * indexCorrelation) is very small in this
case, the biggest term here is max_IO_cost, which is computed as follows:

max_IO_cost = (pages_fetched * spc_random_page_cost) / loop_count;

There is division by loop_count because of predicted effect of caching and
it is exactly this division which makes the run_cost for single index
lookup so low compared with the query version with random_fk_uniq.  So the
main problem is that the planner calls cost_index with loop_count equal to
number of rows of inner table, *but it ignores semi-join semantics*, i.e.
it doesn't account for the number of *unique* rows in the inner table which
will actually be the number of loops.
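
(With round numbers purely for illustration: suppose the inner index scan
fetches 100 pages at the default spc_random_page_cost = 4.0.  Dividing by
loop_count = 9920 gives max_IO_cost = 100 * 4.0 / 9920 ≈ 0.04, whereas
dividing by the ~99 distinct values that actually drive the loop would give
100 * 4.0 / 99 ≈ 4.0, a roughly hundredfold difference in the estimated I/O
cost of each inner index scan.)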

Since this is my first time looking into the optimizer (and in fact any
postgres) code I am not yet able to locate the exact place where this
should be repaired, but I hope that in few days I will have a patch :-)


 Another observation is that the planner does see other plans, because
 tweaking the cost variables like this:

 SET random_page_cost = 2;

 produces this plan (which on my machine runs just a tad slower than the
 first query in your post):

 QUERY PLAN
 ------------------------------------------------------------------
  HashAggregate  (cost=10564.81..10688.47 rows=9893 width=15)
    Group Key: facts.fk
    ->  Nested Loop  (cost=5.42..10515.34 rows=9893 width=15)
          ->  HashAggregate  (cost=2.24..3.23 rows=99 width=4)
                Group Key: random_fk_uniq.fk
                ->  Seq Scan on random_fk_uniq  (cost=0.00..1.99 ...)
          ->  Bitmap Heap Scan on facts  (cost=3.18..105.18 rows=100 ...)
                Recheck Cond: (fk = random_fk_uniq.fk)
                ->  Bitmap Index Scan on facts_fk_idx  (cost=0.00..3.15 ...)
                      Index Cond: (fk = random_fk_uniq.fk)


Yeah, this makes now perfect sense to me. From the run_cost and max_IO_cost
formulas you can see that (in this case!) random_page_cost is almost
exactly the linear factor here. So eventually the results are completely
opposite than I wished at first, i.e. instead of tweaking postgres to
produce lower costs for random_fk_uniq it will start producing higher costs
for random_fk_dupl. But I now believe that this is correct.

Regards,

-David


 I can get similar results by setting cpu_operator_cost=0.005. And
 further lowering to random_page_cost to 1.1 actually gets me this:

 QUERY PLAN
 ------------------------------------------------------------------
  HashAggregate  (cost=6185.41..6309.08 rows=9893 width=15)
    Group Key: facts.fk
    ->  Nested Loop  (cost=2.66..6135.95 rows=9893 width=15)
          ->  HashAggregate  (cost=2.24..3.23 rows=99 width=4)
  

Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Bruce Momjian
On Wed, Mar  4, 2015 at 03:59:02PM -0500, Stephen Frost wrote:
 * Tom Lane (t...@sss.pgh.pa.us) wrote:
  Bruce Momjian br...@momjian.us writes:
   Let me update my list of possible improvements:
  
   1)  MD5 makes users feel uneasy (though our usage is mostly safe)
  
   2)  The per-session salt sent to the client is only 32-bits, meaning
   that it is possible to replay an observed MD5 hash in ~16k connection
   attempts.
  
   3)  Using the user name for the MD5 storage salt allows the MD5 stored
   hash to be used on a different cluster if the user used the same
   password.
  
   4)  Using the user name for the MD5 storage salt allows the MD5 stored
   hash to be used on the _same_ cluster.
  
   5)  Using the user name for the MD5 storage salt causes the renaming of
   a user to break the stored password.
  
  What happened to "possession of the contents of pg_authid is sufficient
  to log in"?  I thought fixing that was one of the objectives here.
 
 Yes, it certainly was.  I think Bruce was thinking that we could simply
 hash what goes on to disk with an additional salt that's stored, but
 that wouldn't actually work without requiring a change to the wireline
 protocol, which is the basis of this entire line of discussion, in my
 view.

I was not really focused on needing or not needing wire protocol
changes, but rather trying to understand the attack vectors and how they
could be fixed, in general.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: knowing detail of config files via SQL

2015-03-04 Thread Peter Eisentraut
On 3/3/15 5:58 PM, Tom Lane wrote:
 Stephen Frost sfr...@snowman.net writes:
 It's not a documented policy but it's certainly what a whole slew of
 functions *do*.  Including pg_start_backup, pg_stop_backup,
 pg_switch_xlog, pg_rotate_logfile, pg_create_restore_point,
 pg_xlog_replay_pause, lo_import, lo_export, and pg_xlog_replay_resume,
 pg_read_file, pg_read_text_file, pg_read_binary_file, and pg_ls_dir.
 
 Indeed, it's the larger portion of what the additional role attributes
 are for.  I'm not entirely thrilled to hear that nearly all of that work
 might be moot, but if that's the case then so be it and let's just
 remove all those superuser checks and instead revoke EXECUTE from public
 and let the superuser grant out those rights.
 
 It does seem like that could be an easier and more flexible answer than
 inventing a pile of role attributes.

One issue is that privilege changes on built-in stuff (and extensions)
are not dumped.  But that is fixable.

 One aspect of this that merits some thought is that in some cases
 access to some set of functions is best granted as a unit.  That's
 easy with role properties but much less so with plain GRANT.
 Do we have enough such cases to make it an issue?

You could have built-in roles, such as backup and ship the system with
the backup role having permissions on some functions.  And then users
are granted those roles.  Similar to how some Linux systems ship with
groups such as adm.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] xpath changes in the recent back branches

2015-03-04 Thread Peter Eisentraut
On 3/4/15 5:00 PM, Robert Haas wrote:
 You can always test the server version number, and you'll have to for
 the next major release anyway.
 
 That's not really the problem, of course.  The problem is you upgrade
 and your app unexpectedly breaks.  The complexity of fixing that once
 it's happened is not entirely irrelevant, but it's not really the main
 problem, either.

That paragraph was mainly a response to the claim that there is no way
to make it work in both old and new versions.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: knowing detail of config files via SQL

2015-03-04 Thread Peter Eisentraut
On 3/3/15 5:29 PM, Stephen Frost wrote:
 For my part, I understood that we specifically didn't want to allow that
 for the same reason that we didn't want to simply depend on the GRANT
 system for the above functions, but everyone else on these discussions
 so far is advocating for using the GRANT system.  My memory might be
 wrong, but I could have sworn that I had brought up exactly that
 suggestion in years past only to have it shot down.

I think a lot of this access control work is done based on some
undocumented understandings, when in fact there is no consensus on
anything.  I think we need more clarity.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Bruce Momjian
On Wed, Mar  4, 2015 at 03:54:09PM -0500, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Let me update my list of possible improvements:
 
  1)  MD5 makes users feel uneasy (though our usage is mostly safe)
 
  2)  The per-session salt sent to the client is only 32-bits, meaning
  that it is possible to replay an observed MD5 hash in ~16k connection
  attempts.
 
  3)  Using the user name for the MD5 storage salt allows the MD5 stored
  hash to be used on a different cluster if the user used the same
  password.
 
  4)  Using the user name for the MD5 storage salt allows the MD5 stored
  hash to be used on the _same_ cluster.
 
  5)  Using the user name for the MD5 storage salt causes the renaming of
  a user to break the stored password.
 
 What happened to "possession of the contents of pg_authid is sufficient
 to log in"?  I thought fixing that was one of the objectives here.

That is #4.

Addressing Stephen's ideas, I question whether there are enough people
who care about fixing #3 and #4 to require the use of TLS to fix it.  It
would be nice if we knew if the system was only using TLS, but we can't,
so we would need a GUC for the administrator to choose always-TLS to fix
#3 and #4.

One way to fix #2 would be to use a per-user or per-cluster counter for
the session salt, rather than a random number --- that would change
replays from ~16k to 4 billion, with no wire protocol change needed.
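
As a sketch of that counter idea (not actual server code; shared-memory
placement and locking are omitted):

#include <stdint.h>

/* would live in shared memory, initialized at cluster start */
static uint32_t md5_salt_counter = 0;

/*
 * Replaces the random session salt: an observed (salt, response) pair
 * cannot be replayed until the counter wraps after 2^32 connections.
 */
static uint32_t
next_md5_session_salt(void)
{
	return ++md5_salt_counter;	/* would need an atomic increment */
}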

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] a2e35b53 added unused variable to ConversionCreate()

2015-03-04 Thread Peter Geoghegan
I'm seeing this on the master branch tip when building at -O2:

pg_conversion.c: In function ‘ConversionCreate’:
pg_conversion.c:53:6: error: variable ‘oid’ set but not used
[-Werror=unused-but-set-variable]
  Oid   oid;
  ^
cc1: all warnings being treated as errors

I think that commit a2e35b53 did this.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Josh Berkus
Catching up here ...

On 03/03/2015 06:01 PM, Bruce Momjian wrote:
 It feels like MD5 has accumulated enough problems that we need to start
 looking for another way to store and pass passwords.  The MD5 problems
 are:
 
 1)  MD5 makes users feel uneasy (though our usage is mostly safe) 
 
 2)  The per-session salt sent to the client is only 32-bits, meaning
 that it is possible to replay an observed MD5 hash in ~16k connection
 attempts.

Seems like we could pretty easily increase the size of the salt.  Of
course, that just increases the required number of connection attempts,
without really fixing the problem.

 3)  Using the user name for the MD5 storage salt allows the MD5 stored
 hash to be used on a different cluster if the user used the same
 password. 

This is a feature as well as a bug. For example, pgBouncer relies on
this aspect of md5 auth.

 4)  Using the user name for the MD5 storage salt causes the renaming of
 a user to break the stored password.

Weirdly, in 17 years of Postgres, I've never encountered this issue.

So, are we more worried about attackers getting a copy of pg_authid, or
sniffing the hash on the wire?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Bruce Momjian
On Wed, Mar  4, 2015 at 05:56:25PM -0800, Josh Berkus wrote:
 Catching up here ...
 
 On 03/03/2015 06:01 PM, Bruce Momjian wrote:
  It feels like MD5 has accumulated enough problems that we need to start
  looking for another way to store and pass passwords.  The MD5 problems
  are:
  
  1)  MD5 makes users feel uneasy (though our usage is mostly safe) 
  
  2)  The per-session salt sent to the client is only 32-bits, meaning
  that it is possible to replay an observed MD5 hash in ~16k connection
  attempts.
 
 Seems like we could pretty easily increase the size of the salt.  Of
 course, that just increases the required number of connection attempts,
 without really fixing the problem.
 
  3)  Using the user name for the MD5 storage salt allows the MD5 stored
  hash to be used on a different cluster if the user used the same
  password. 
 
 This is a feature as well as a bug. For example, pgBouncer relies on
 this aspect of md5 auth.
 
  4)  Using the user name for the MD5 storage salt causes the renaming of
  a user to break the stored password.
 
 Weirdly, in 17 years of Postgres, I've never encountered this issue.
 
 So, are we more worried about attackers getting a copy of pg_authid, or
 sniffing the hash on the wire?

Both.  Stephen is more worried about pg_authid, but I am more worried
about sniffing.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add pg_settings.pending_restart column

2015-03-04 Thread Peter Eisentraut
On 2/17/15 10:45 AM, Robert Haas wrote:
 You don't really need the else here, and in parallel cases:
 
  if (*conf->variable != newval)
  {
 +record->status |= GUC_PENDING_RESTART;
  ereport(elevel,
  (errcode(ERRCODE_CANT_CHANGE_RUNTIME_PARAM),
   errmsg("parameter \"%s\" cannot be
 changed without restarting the server",
  name)));
  return 0;
  }
 +else
 +record->status &= ~GUC_PENDING_RESTART;
  return -1;
 
 The if-statement ends with return 0 so there is no reason for the else.

I kind of liked the symmetry of if/else, but I can change it.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help

2015-03-04 Thread Bruce Momjian
On Wed, Mar  4, 2015 at 04:19:00PM -0500, Stephen Frost wrote:
  Hm, well, "don't change the wireline protocol" could be another wanna-have
  ... but if we want to stop using MD5, that's not really a realistic goal
  is it?
 
 I'm trying to address both sides of the issue- improve the current
 situation without breaking existing clients AND provide a new auth
 method and encourage everyone to move to it as soon as they're able.  We
 can't simply deprecate md5 as that would break pg_upgrade for any users
 who are currently using it.

Actually, pg_upgrade uses 'trust' and a private socket file, at least on
Unix.  Of course, post-upgrade, users would have trouble logging in.

 This half of the discussion has been all about improving md5 without
 breaking the wireline protocol or existing users.
 
 The other half of the discussion is about implementing a new
 password-based authentication based on one of the vetted authentication
 protocols already in use today (SCRAM or SRP, for example).  Using those
 new authentication protocols would include a move off of the MD5 hashing
 function, of course.  This would also mean breaking the on-disk hash,
 but that's necessary anyway because what we do there today isn't secure
 either and no amount of futzing is going to change that.

While I don't like the requirement to use TLS for the MD5 fix, I also
don't like the idea of having users go through updating all these
passwords only to have us implement the _right_ solution in the next
release.  I don't see why it is useful to be patching up MD5 with a TLS
requirement when we know they should be moving to SCRAM or SRP.  If the
MD5 change were transparent to users/admins, that would be different.

 I've got nearly zero interest in trying to go half-way on this by
 designing something that we think is secure which has had no external
 review or anyone else who uses it.  Further, going that route makes me
 very nervous that we'd decide on certain compromises in order to make
 things easier for users without actually realising the problems with
 such an approach (eg: "well, if we use hack X we wouldn't have to
 change what is stored on disk and therefore we wouldn't break

I am not happy to blindly accept a new security setup without
understanding exactly what it is trying to fix, which is why I am asking
all these questions.

 pg_upgrade..").  I'd *much* rather have a clean break and migration
 path for users.  If we had a password rotation capability then this kind

Yes, I think we are now saying the same thing.

 of a transition would be a lot easier on our users and I'd definitely
 recommend that we add that with this new authentication mechanism, to
 address this kind of issue in the future (not to mention that's often
 asked for..).  Password complexity would be another thing we should
 really add and is also often requested.

I agree our password management could use improvement.

 Frankly, in the end, I don't see us being able to produce something in
 time for this release unless someone can be dedicated to the effort over
 the next couple of months, and therefore I'd prefer to improve the
 current issues with md5 without breaking the wireline protocol than
 simply do nothing (again).

I am not sure why we have to shove something into 9.5 --- as you said,
this issue has been known about for 10+ years.

I think we should do what we can to improve MD5 in cases where the
user/admin needs to take no action, and then move to add a better
authentication method.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Join push-down support for foreign tables

2015-03-04 Thread Kouhei Kaigai
 Here is v4 patch of Join push-down support for foreign tables.  This
 patch requires Custom/Foreign join patch v7 posted by Kaigai-san.
 
Thanks for your efforts,

 In this version I added a check on the query type which gives up pushing
 down joins when the join is part of an underlying query of
 UPDATE/DELETE.

 As of now postgres_fdw builds a proper remote query but it can't bring
 ctid value up to postgresExecForeignUpdate()...

The ctid reference shall exist as a usual column reference in
the target-list of the join-rel. It is not the origin of the problem.

See my investigation below. I guess the special treatment of whole-row
references is problematic, rather than ctid.

 To reproduce the error, please execute the query below after running
 the attached init_fdw.sql to build the test environment.  Note that the
 script drops user1, and creates the databases fdw and pgbench.
 
 fdw=# explain (verbose) update pgbench_branches b set filler = 'foo'
 from pgbench_tellers t where t.bid = b.bid and t.tid  10;
 
  QUERY PLAN
 ------------------------------------------------------------------------------
  Update on public.pgbench_branches b  (cost=100.00..100.67 rows=67 width=390)
    Remote SQL: UPDATE public.pgbench_branches SET filler = $2 WHERE ctid = $1
    ->  Foreign Scan  (cost=100.00..100.67 rows=67 width=390)
          Output: b.bid, b.bbalance, 'foo'::character(88), b.ctid, *
          Remote SQL: SELECT r.a_0, r.a_1, r.a_2, l FROM (SELECT tid,
 bid, tbalance, filler FROM public.pgbench_tellers WHERE ((tid  10)))
 l (a_0, a_1) INNER JOIN (SELECT bid, bbalance, NULL, ctid FROM
 public.pgbench_branches FOR UPDATE) r (a_0, a_1, a_2, a_3)  ON ((r.a_0 = l.a_1))
 (5 rows)
 fdw=# explain (analyze, verbose) update pgbench_branches b set filler
 = 'foo' from pgbench_tellers t where t.bid = b.bid and t.tid  10;
 ERROR:  ctid is NULL
 
It seems to me the left relation has a smaller number of alias definitions
than required. The left SELECT statement has 4 target-entries, however,
only a_0 and a_1 are defined.
The logic to assign aliases of relations/columns might be problematic,
because deparseColumnAliases() adds aliases for each target-entry of the
underlying SELECT statement, but skips the whole-row-reference.
On the other hand, postgres_fdw takes special treatment of whole-
row-references. Once a whole-row-reference is used, postgres_fdw
puts all the non-system columns on the target-list of the remote SELECT
statement.
Thus, it makes a mismatch between baserel->targetlist and the generated
aliases.

I think we have two options:
(1) Stop special treatment for whole-row-reference, even if it is
simple foreign-scan (which does not involve join).
(2) Add a new routine to reconstruct whole-row-reference of
a particular relation from the target-entries of joined relations.

My preference is (2). It can keep the code simple, and I doubt
whether the network traffic optimization actually outweighs the
disadvantage of local tuple reconstruction (which consumes additional
CPU cycles).

Does it make sense for you?
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add pg_settings.pending_restart column

2015-03-04 Thread Peter Eisentraut
On 2/15/15 3:41 AM, David G Johnston wrote:
 Otherwise it seems fine, but I cannot help but feel that false positives
 are possible, though nothing that doesn't already exist.  Mainly, is the
 change going to end up only affecting the reset or default value but not
 the currently active value?

I don't understand what you mean by this.  We already have the logic to
detect the situation concerned.  I'm only extending the reporting.  Of
course, if you can think of a case where it doesn't work correctly,
let's hear it.

 Instead of a boolean I'd rather have a string/enum that can capture the fact
 that a reboot is required and the expected outcome.  The same field can then
 communicate non-reboot stuff too - like SIGHUP required or session
 scope.
 
 A simple "reboot required" boolean API could just as easily be done via a
 function; and why limit it to just reboots and not reconnection or SIGHUP
 required?
 
 Scope creeping but the reboot case doesn't seem that special overall; other
 than the effort needed to realize the updated value.

It is special because we apply the context restrictions differently in
this case.  Normally, when you try to set a parameter that you can't
set, you get an error.  But in the case of restart-only, we just ignore
it (and log a message).  The point here is to make that more easily
detectable.

What you describe sounds more like you want to see the complete
stack of overridden and possibly overriding settings.  That's a bit of a
different feature, I think.





Re: [HACKERS] Idea: closing the loop for pg_ctl reload

2015-03-04 Thread Jan de Visser
On March 4, 2015 11:08:09 PM Andres Freund wrote:
 Let's get the basic feature (notification of failed reloads) done
 first. That will be required with or without including the error
 message.  Then we can get more fancy later, if somebody really wants to
 invest the time.

Except for also fixing pg_reload_conf() to pay attention to what happens with 
postmaster.pid, the patch I submitted does exactly what you want, I think.

And I don't mind spending time on stuff like this. I'm not smart enough to deal 
with actual, you know, database technology.





Re: [HACKERS] Proposal: knowing detail of config files via SQL

2015-03-04 Thread Peter Eisentraut
On 3/2/15 4:47 PM, Robert Haas wrote:
 On Sat, Feb 28, 2015 at 12:27 AM, Stephen Frost sfr...@snowman.net wrote:
 While this generally works, the usual expectation is that functions
 that should be superuser-only have a check in the function rather than
 depending on the execute privilege.  I'm certainly happy to debate the
 merits of that approach, but for the purposes of this patch, I'd suggest
 you stick an if (!superuser()) ereport("must be superuser") check into the
 function itself and not worry about setting the correct permissions on
 it.
 
 -1.  If that policy exists at all, it's a BAD policy, because it
 prevents users from changing the permissions using DDL.  I think the
 superuser check should be inside the function, when, for example, it
 masks some of the output data depending on the user's permissions.
 But I see little virtue in handicapping an attempt by a superuser to
 grant SELECT rights on pg_file_settings.

This is in fact how most if not all code ensures superuser-only access
to functions, so for the purpose of this patch, I think it is the
correct approach.  Someone may well change that soon after, if the other
ongoing efforts conclude.
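
For context, the DDL-level flexibility Robert refers to would look like
this (a sketch; pg_file_settings is the view proposed in this thread,
and monitoring_role is a made-up role name):

    REVOKE SELECT ON pg_file_settings FROM PUBLIC;
    GRANT SELECT ON pg_file_settings TO monitoring_role;

A hardcoded superuser() check inside the underlying function would make
the GRANT above ineffective, which is exactly the trade-off being
debated here.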





Re: [HACKERS] CATUPDATE confusion?

2015-03-04 Thread Peter Eisentraut
Any other opinions?

The options are:

0) Leave as is.

1) Remove catupdate and replace with superuser checks.

2) Remove catupdate and rely on regular table permissions on catalogs.





Re: [HACKERS] forward vs backward slashes in msvc build code

2015-03-04 Thread Andrew Dunstan


On 03/04/2015 10:37 PM, Peter Eisentraut wrote:

On 2/15/15 6:55 AM, Michael Paquier wrote:

I tested quickly the second patch with MS 2010 and I am getting a
build failure: chkpass cannot complete because of crypt missing. On
master build passes. More details here:
C:\Users\mpaquier\git\postgres\pgsql.sln (default target) (1) -
C:\Users\mpaquier\git\postgres\chkpass.vcxproj (default target) (36) -
(Link target) -
   chkpass.obj : error LNK2019: unresolved external symbol crypt
referenced in function chkpass_in
[C:\Users\ioltas\git\postgres\chkpass.vcxproj]
   .\Release\chkpass\chkpass.dll : fatal error LNK1120: 1 unresolved
externals [C:\Users\mpaquier\git\postgres\chkpass.vcxproj]

I can't tell from just looking at the code how chkpass would normally
find crypt.  The msvc tools neither parse SHLIB_LINK nor have hardcoded
knowledge.  Any idea?




Which library is it in? There are sections at the top of Mkvcbuild.pm 
for including various libraries in contrib modules that need them.


cheers

andrew




Re: [HACKERS] pg_upgrade and rsync

2015-03-04 Thread Vladimir Borodin

 On 4 March 2015, at 19:28, Bruce Momjian br...@momjian.us wrote:
 
 On Wed, Mar  4, 2015 at 01:53:47PM +0300, Vladimir Borodin wrote:
After running initdb to create the new master, but before running
pg_upgrade, modify the new master's postgresql.conf and change wal_level
= hot_standby.  (Don't modify pg_hba.conf at this stage.)
 
 
 
 That does not help. The reason is that pg_upgrade sets 'Current wal_level
 setting: minimal' in the control-file, and that does not depend on what is
 set in postgresql.conf before running pg_upgrade. Details can be seen here:
 http://pastie.org/9998671.
 
 Well, what is happening is that the pg_resetxlog commands we run inside
 pg_upgrade set the pg_controldata file's wal_level to minimal, but as
 you saw, starting the server sets the pg_controldata properly. 
 pg_resetxlog is not changing the WAL files at all, just the control
 file.
 
 The workaround for this is to start and cleanly shut down postgres on master
 after running pg_upgrade but before running rsync. After that there would be
 a good control-file for streaming replication and rsync to replica can be done.
 
 You are correct that a pg_controldata file is copied over that has
 wal_level=minimal, but that should not be a problem.

I suppose this is the root cause of why the replica does not start as a hot
standby. It is enough to start it as a warm standby, but not as a hot standby.
See the CheckRequiredParameterValues function in xlog.c, which is called
inside the StartupXLOG function.

 
 But it could not be done with the --size-only key, because the control-file
 is of fixed size and rsync would skip it. Or maybe everything should be
 copied with --size-only and the control-file should be copied without this
 option.
 
 Well, what happens is that there is no _new_ standby pg_controldata
 file, so it is copied fully from the new master.  Are you running initdb
 to create the new standby --- you shouldn't be doing that as the rsync
 will do that for you.  

No, I don’t. The scenario of the problem with copying the control-file was when I:
1. ran pg_upgrade on the master and got a control-file with wal_level = minimal,
2. did rsync --size-only to the replica (and it got this control-file with
wal_level = minimal),
3. started and stopped postgres on the master to get a «good» control-file with
wal_level = hot_standby,
4. did rsync --size-only to the replica one more time. And this time the
control-file was not copied, because its size had not changed.

Actually, if you don’t do step 2, everything works as expected. Sorry for 
bothering you.

 Also, are you cleanly shutting down all the
 servers, or using pg_ctl -m immediate?

I use an init script; it shuts down cleanly with -m fast.

 
 -- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                           http://enterprisedb.com
 
  + Everyone has their own god. +


--
May the force be with you…
https://simply.name



Re: [HACKERS] Combining Aggregates

2015-03-04 Thread Kouhei Kaigai
 On Wed, Mar 4, 2015 at 4:41 AM, David Rowley dgrowle...@gmail.com wrote:
  This thread mentions parallel queries as a use case, but that means
  passing data between processes, and that requires being able to
  serialize and deserialize the aggregate state somehow. For actual data
  types that's not overly difficult I guess (we can use in/out functions),
  but what about aggregates using 'internal' state? I.e. aggregates
  passing pointers that we can't serialize?
 
  This is a good question. I really don't know the answer to it as I've not
  looked at Robert's API for passing data between backends yet.
 
  Maybe Robert or Amit can answer this?
 
 I think parallel aggregation will probably require both the
 infrastructure discussed here, namely the ability to combine two
 transition states into a single transition state, and also the ability
 to serialize and de-serialize transition states, which has previously
 been discussed in the context of letting hash aggregates spill to
 disk.  My current thinking is that the parallel plan will look
 something like this:
 
 HashAggregateFinish
 -> Funnel
    -> HashAggregatePartial
       -> PartialHeapScan
 
 So the workers will all pull from a partial heap scan and each worker
 will aggregate its own portion of the data.  Then it'll need to pass
 the results of that step back to the master, which will aggregate the
 partial results and produce the final results.  I'm guessing that if
 we're grouping on, say, column a, the HashAggregatePartial nodes will
 kick out 2-column tuples of the form (value for a, serialized
 transition state for group with that value for a).
 
It may not be an urgent topic to be discussed for v9.5; however,
I'd like to raise a topic about the planner and aggregates.

Once we have two-phase aggregation, the planner needs to pay attention
to the possibility of partial aggregation during plan construction,
even though our current implementation attaches the Agg node after
the join/scan plan has been constructed.

Probably, a straightforward design is to add a FunnelPath with some
child nodes in set_rel_pathlist() or add_paths_to_joinrel().
Its child node may be a PartialAggregate node (or some other parallel-
safe node, of course). In this case, it must inform the planner that
this node (more precisely, the targetlist of the node) returns a
partial aggregate, instead of values row-by-row.
Then, the planner needs to track and care about which type of data shall
be returned to the upper node. A Path node may have a flag to show it,
and we may also need to inject a dummy PartialAggregate if we try to
join a relation that returns values row-by-row with another one that
returns a partial aggregate.
It also leads to another problem: the RelOptInfo->targetlist will
depend on the Path node chosen. Even if a float datum is expected as
an argument of AVG(), the state to be combined is float[3].
So, we will need to adjust the targetlist of the RelOptInfo once the
Path is chosen.
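
To make the float[3] point concrete, here is a minimal sketch of what
combining two avg(float8) partial states amounts to, with the state
modelled as {N, sum(X), sum(X*X)} (illustration only; the real
transition state is managed in C inside the backend, and this function
name is made up):

    CREATE FUNCTION float8_avg_combine(s1 float8[], s2 float8[])
    RETURNS float8[] AS $$
        -- combining two partial states is element-wise addition
        SELECT ARRAY[s1[1] + s2[1], s1[2] + s2[2], s1[3] + s2[3]]
    $$ LANGUAGE sql IMMUTABLE;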

Anyway, these are at least some of the complicated issues around
integrating two-phase aggregation with the planner. Can someone
suggest a minimally invasive way to do this integration?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com



Re: [HACKERS] File based Incremental backup v8

2015-03-04 Thread Bruce Momjian
On Thu, Mar  5, 2015 at 01:25:13PM +0900, Fujii Masao wrote:
  Yeah, it might make the situation better than today. But I'm afraid that
  many users might get disappointed about that behavior of an incremental
  backup after the release...
 
  I don't get what do you mean here. Can you elaborate this point?
 
 The proposed version of LSN-based incremental backup has some limitations
 (e.g., every database file needs to be read even when there has been no
 modification since the last backup, which may make the backup take longer
 than users expect), and these may disappoint users. So I'm afraid that the
 users who can benefit from the feature might be very limited. IOW, I'm just
 sticking to the idea of the timestamp-based one :) But I should drop it if
 the majority on the list prefers the LSN-based one even with such
 limitations.

We need numbers on how effective each level of tracking will be.  Until
then, the patch can't move forward.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                           http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Strange assertion using VACOPT_FREEZE in vacuum.c

2015-03-04 Thread Michael Paquier
On Thu, Mar 5, 2015 at 12:54 AM, Stephen Frost sfr...@snowman.net wrote:
 * Robert Haas (robertmh...@gmail.com) wrote:
 On Sun, Mar 1, 2015 at 6:58 AM, Michael Paquier
 michael.paqu...@gmail.com wrote:
  Hm, why not. That would remove all inconsistencies between the parser
  and the autovacuum code path. Perhaps something like the attached
  makes sense then?

 I really don't see this patch, or any of the previous ones, as solving
 any actual problem.  There's no bug here, and no reason to suspect
 that future code changes would be particularly likely to introduce one.
 Assertions are a great way to help developers catch coding mistakes,
 but it's a real stretch to think that a developer is going to add a
 new syntax for ANALYZE that involves setting options proper to VACUUM
 and not notice it.

 Yeah, I haven't been terribly excited about it for the same reasons.
 Had Michael's latest patch meant that we didn't need to pass VacuumStmt
 down into the other functions then I might have been a bit more behind
 it, but as is we seem to be simply duplicating everything except the
 actual Node entry in the struct, which kind of missed the point.

OK, if you folks think that there is no point in adding assertions
on the freeze params for an ANALYZE, then let's move on.

 This thread started out because Michael read an assertion in the code
 and misunderstood what that assertion was trying to guard against.
 I'm not sure there's any code change needed here at all, but if there
 is, I suggest we confine it to adding a one-line comment above that
 assertion clarifying its purpose, like /* check that parser didn't
 produce ANALYZE FULL or ANALYZE FREEZE */.

Fine for me.
-- 
Michael




Re: [HACKERS] failures with tuplesort and ordered set aggregates (due to 5cefbf5a6c44)

2015-03-04 Thread Robert Haas
On Tue, Mar 3, 2015 at 7:14 PM, Peter Geoghegan p...@heroku.com wrote:
 My patch actually matches Andrew Gierth's datumsort patch, in that it
 also uses this convention, as I believe it should. For that reason,
 I'd prefer to make the comment added in November true, rather than
 changing the comment...I feel it ought to be true anyway (which is
 what I meant about a pre-existing issue). You'll still need the
 defenses you've added for the hash case, of course. You might want
 to also put a comment specifying why it's necessary, since the hash
 tuple case is the oddball case that doesn't always have sortKeys
 set.

 In other words, I suggest that you commit the union of our two
 patches, less your changes to the comments around sortKeys.

I think we should commit my patch, and if a future patch needs
sortKeys set in more places, it can make that change itself.  There's
no reason why it's needed with the code as it is today, and no reason
to let bits of future changes leak into a bug-fix patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] forward vs backward slashes in msvc build code

2015-03-04 Thread Peter Eisentraut
On 2/15/15 6:55 AM, Michael Paquier wrote:
 I tested quickly the second patch with MS 2010 and I am getting a
 build failure: chkpass cannot complete because of crypt missing. On
 master build passes. More details here:
 C:\Users\mpaquier\git\postgres\pgsql.sln (default target) (1) -
 C:\Users\mpaquier\git\postgres\chkpass.vcxproj (default target) (36) -
 (Link target) -
   chkpass.obj : error LNK2019: unresolved external symbol crypt
 referenced in function chkpass_in
 [C:\Users\ioltas\git\postgres\chkpass.vcxproj]
   .\Release\chkpass\chkpass.dll : fatal error LNK1120: 1 unresolved
 externals [C:\Users\mpaquier\git\postgres\chkpass.vcxproj]

I can't tell from just looking at the code how chkpass would normally
find crypt.  The msvc tools neither parse SHLIB_LINK nor have hardcoded
knowledge.  Any idea?




Re: [HACKERS] File based Incremental backup v8

2015-03-04 Thread Fujii Masao
On Thu, Mar 5, 2015 at 1:59 AM, Marco Nenciarini
marco.nenciar...@2ndquadrant.it wrote:
 Hi Fujii,

 Il 03/03/15 11:48, Fujii Masao ha scritto:
 On Tue, Mar 3, 2015 at 12:36 AM, Marco Nenciarini
 marco.nenciar...@2ndquadrant.it wrote:
 Il 02/03/15 14:21, Fujii Masao ha scritto:
 On Thu, Feb 12, 2015 at 10:50 PM, Marco Nenciarini
 marco.nenciar...@2ndquadrant.it wrote:
 Hi,

 I've attached an updated version of the patch.

 basebackup.c:1565: warning: format '%lld' expects type 'long long
 int', but argument 8 has type '__off_t'
 basebackup.c:1565: warning: format '%lld' expects type 'long long
 int', but argument 8 has type '__off_t'
 pg_basebackup.c:865: warning: ISO C90 forbids mixed declarations and code


  I'll add an explicit cast at those two lines.

 When I applied three patches and compiled the code, I got the above 
 warnings.

 How can we get a full backup that we can use for archive recovery from
 the first full backup and subsequent incremental backups? What commands
 should we use for that, for example? It's better to document that.


 I've sent a Python PoC that supports the plain format only (not the tar
 one). I'm currently rewriting it in C (with tar support as well) and I'll
 send a new patch containing it ASAP.

 Yeah, if a special tool is required for that purpose, the patch should
 include it.


 I'm working on it. The interface will be exactly the same as that of the
 PoC script I've attached to

 54c7cdad.6060...@2ndquadrant.it

 What does the "1" in the heading line of backup_profile mean?


 Nothing. It's a version number. If you think it's misleading I will remove 
 it.

 A version number for the file format of the backup profile? If it's
 required as a safeguard for validating the backup profile, it should be
 included in the profile file. For example, it might be useful to check
 whether the pg_basebackup executable is compatible with the source backup
 that you specify. But more info might be needed for such validation.


 The current implementation bails out with an error if the header line is
 different from what it expects.
 It also reports an error if the 2nd line is not the start WAL location.
 That's all that pg_basebackup needs to start a new incremental backup. All
 the other information is useful for reconstructing a full backup from an
 incremental backup, or maybe for checking the completeness of an archived
 full backup.
 Initially the profile was present only in incremental backups, but after
 some discussion on the list we agreed to always write it.

Don't we need more checks for compatibility between the backup-target database
cluster and the source incremental backup? Without more such checks, I'm afraid
we can easily get corrupted incremental backups. For example, shouldn't
pg_basebackup emit an error if the target and source have different system IDs,
like walreceiver does? What happens if the timeline ID differs between the
source and target? What happens if the source was taken from the standby but
the new incremental backup will be taken from the master? Do we need to check these?

 Sorry if this has already been discussed. Why is a backup profile file
 necessary? Maybe it will be necessary in the future, but currently it
 seems not to be.

 It's necessary because it's the only way to detect deleted files.

 Maybe I'm missing something. Seems we can detect that even without a profile.
 For example, please imagine the case where the file has been deleted since
 the last full backup and then the incremental backup is taken. In this case,
 that deleted file exists only in the full backup. We can detect the deletion 
 of
 the file by checking both full and incremental backups.


 When you take an incremental backup, only changed files are sent. Without the 
 backup_profile in the incremental backup, you cannot detect a deleted file, 
 because it's indistinguishable from a file that is not changed.

Yeah, you're right!

 Have we really gotten consensus about the current design, especially that
 every file basically needs to be read to check whether it has been modified
 since the last backup, even when *no* modification has happened since the
 last backup?

 The real problem here is that there is currently no way to detect that a
 file has not changed since the last backup. We agreed not to use file
 system timestamps as they are not reliable for that purpose.

 TBH I prefer the timestamp-based approach for the first version of
 incremental backup, even if it's less reliable than the LSN-based one. I
 think that some users who are using timestamp-based rsync (i.e., its
 default mode) for backups would be satisfied with a timestamp-based one.

 The original design was to compare size+timestamp+checksums (only if 
 everything else matches and the file has been modified after the start of the 
 backup), but the feedback from the list was that we cannot trust the 
 filesystem mtime and we must use LSN instead.


 Using LSN has a significant advantage over using checksums, as we can start

Re: [HACKERS] Reduce pinning in btree indexes

2015-03-04 Thread Kyotaro HORIGUCHI
Hello,

 It looks like the remaining performance regression was indeed a
 result of code alignment.  I found two paranoia assignments I had
 accidentally failed to put back with the rest of the mark/restore
 optimization; after that trivial change the patched version is
 ever-so-slightly faster than master on all tests.

The performance your test shows looks great. The gain might
come from the removal of buffer locking in _bt_next. I would also
like to see this land in 9.5, but that is a matter for the
committers.

Apart from that, I looked into the patch itself more
closely. Please reconcile the comments below as you see fit;
some of them may contradict each other. :)


In nbtutils.c, _bt_killitems:

- This is not the fault of this patch, but the comment for
  _bt_killitems seems to have a trivial typo.

* _bt_killitems - set LP_DEAD state for items an indexscan caller has
* told us were killed
*
* scan->so contains information about the current page and killed tuples
* thereon (generally, this should only be called if so->numKilled > 0).

  I suppose scan->so should be scan->opaque and
  so->numKilled should be opaque->numKilled.


- The following comment looks somewhat wrong.

   /* Since the else condition needs page, get it here, too. */

  Does "the else condition needs page" mean that the page of the buffer
  is needed later? In any case, I suppose the comment itself is not
  necessary.

- If (BTScanPosIsPinned(so->currPos)).

  As I mention below for nbtsearch.c, the page acquired in the
  if-then block may be vacuumed, so the LSN check done in the
  if-else block is needed in the if-then block as well. That can be
  accomplished simply by moving the end of the if-else block.

- _bt_killitems is called without a pin when rescanning from
  before, so I think the previous code already considered the
  unpinned case (if (ItemPointerEquals(ituple... does
  that). Vacuum rearranges line pointers after deleting LP_DEAD
  items, so the previous check seems enough for the purpose. The
  previous code is more efficient because the matching items will
  be processed even after a vacuum.


In nbtsearch.c

- _bt_first(): It now releases the share lock on the page before
  the items to be killed are processed. This allows other backends
  to insert items into the page simultaneously. It seems dangerous
  on its own, but _bt_killitems already considers that
  case. Anyway, I think it'll be better to add a comment
  mentioning why the unlocking is not dangerous.


... Sorry, time's up for now.

regards,


 Average of 3 `make check-world` runs:
 master: 336.288 seconds
 patch:  332.237 seconds
 
 Average of 6 `make check` runs:
 master: 25.409 seconds
 patch:  25.270 seconds
 
 The following were all run 12 times, the worst 2 dropped, the rest
 averaged:
 
 Kyotaro-san's 1m mark worst case benchmark:
 master: 962.581 ms, 6.765 stdev
 patch:  947.518 ms, 3.584 stdev
 
 Kyotaro-san's 500k mark, 500k restore worst case benchmark:
 master: 1366.639 ms, 5.314 stdev
 patch:  1363.844 ms, 5.896 stdev
 
 pgbench -i -s 16 pgbench / pgbench -c 16 -j 4 -T 500 pgbench
 master: 265.011 tps, 4.952 stdev
 patch:  267.164 tps, 9.094 stdev
 
 How do people feel about the idea of me pushing this for 9.5 (after
 I clean up all the affected comments and README files)?  I know
 this first appeared in the last CF, but the footprint is fairly
 small and the only user-visible behavior change is that a btree
 index scan of a WAL-logged table using an MVCC snapshot no longer
 blocks a vacuum indefinitely.  (If there are objections I will move
 this to the first CF for the next release.)
 
  src/backend/access/nbtree/nbtree.c|  50 +---
  src/backend/access/nbtree/nbtsearch.c | 141 +-
  src/backend/access/nbtree/nbtutils.c  |  58 ++
  src/include/access/nbtree.h   |  36 -
  4 files changed, 201 insertions(+), 84 deletions(-)

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: [HACKERS] Join push-down support for foreign tables

2015-03-04 Thread Etsuro Fujita

On 2015/03/03 21:34, Shigeru Hanada wrote:

I rebased the join push-down patch onto Kaigai-san's Custom/Foreign Join
v6 patch.


Thanks for the work, Hanada-san and KaiGai-san!

Maybe I'm missing something, but did we agree to take this approach, ie, 
join push-down on top of custom join?  There is a comment about that 
[1].  I just thought it'd be better to achieve a consensus before 
implementing the feature further.



but still the patch
has an issue about joins underlying UPDATE or DELETE.  Now I'm working
on fixing this issue.


Is that something like UPDATE foo ... FROM bar ... where both foo and 
bar are remote?  If so, I think it'd be better to push such an update 
down to the remote, as discussed in [2], and I'd like to work on that 
together!
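
For example, something of this shape (a sketch; the table and column
names are made up, and foo and bar are assumed to be foreign tables on
the same remote server):

    UPDATE foo
       SET val = bar.val
      FROM bar
     WHERE foo.id = bar.id;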


Sorry for being late to the party.

Best regards,
Etsuro Fujita

[1] http://www.postgresql.org/message-id/23343.1418658...@sss.pgh.pa.us
[2] http://www.postgresql.org/message-id/31942.1410534...@sss.pgh.pa.us



