Re: [HACKERS] jsonb and nested hstore

2014-03-11 Thread Alexander Korotkov
On Tue, Mar 11, 2014 at 5:19 AM, Peter Geoghegan p...@heroku.com wrote:

 On Mon, Mar 10, 2014 at 4:19 AM, Alexander Korotkov
 aekorot...@gmail.com wrote:
  Here it is.

 So it looks like what you have here is analogous to the other problems
 that I fixed with both GiST and GIN. That isn't surprising, and this
 does fix my test-case. I'm not terribly happy about the lack of
 explanation for the hashing in that loop, though. Why use COMP_CRC32()
 at all, for one thing?

 Why do this for non-primitive jsonb hashing?

 COMP_CRC32(stack->hash_state, PATH_SEPARATOR, 1);

 Where PATH_SEPARATOR is:

 #define PATH_SEPARATOR ("\0")

 Actually, come to think of it, why not just use one hashing function
 everywhere? i.e., jsonb_hash(PG_FUNCTION_ARGS)? It's already very
 similar. Pretty much every hash operator support function 1 (i.e. a
 particular type's hash function) is implemented with hash_any(). Can't
 we just do the same here? In any case it isn't obvious why the
 requirements for those two things (the hashing mechanism used by the
 jsonb_hash_ops GIN opclass, and the hash operator class support
 function 1 hash function) cannot be the same thing.


It's because the CRC32 interface allows incremental calculation, while hash_any
requires a single chunk of memory. I don't think that unfolding everything is a
good idea. But we could implement an incremental interface for hash_any.
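For illustration, here is a minimal standalone sketch of the interface difference being discussed. The start/update/finish shape loosely mirrors PostgreSQL's INIT_CRC32()/COMP_CRC32()/FIN_CRC32() macros, but the table-driven CRC-32 below is an independent toy implementation, not code from the tree; hash_any() is represented only by the one-shot wrapper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy bytewise CRC-32 (IEEE polynomial 0xEDB88320), illustration only. */
static uint32_t crc32_table[256];

static void
crc32_init_table(void)
{
	for (uint32_t i = 0; i < 256; i++)
	{
		uint32_t c = i;

		for (int k = 0; k < 8; k++)
			c = (c & 1) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
		crc32_table[i] = c;
	}
}

/* Incremental interface: state survives across arbitrarily many chunks. */
static uint32_t
crc32_start(void)
{
	return 0xFFFFFFFFu;
}

static uint32_t
crc32_comp(uint32_t state, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len--)
		state = crc32_table[(state ^ *p++) & 0xFF] ^ (state >> 8);
	return state;
}

static uint32_t
crc32_finish(uint32_t state)
{
	return state ^ 0xFFFFFFFFu;
}

/*
 * One-shot interface, the constraint hash_any() imposes: the whole key
 * must already sit in one contiguous buffer.
 */
static uint32_t
crc32_oneshot(const void *data, size_t len)
{
	return crc32_finish(crc32_comp(crc32_start(), data, len));
}
```

Hashing a nested value incrementally just means calling crc32_comp() once per element as a traversal loop visits it, whereas the one-shot style would force "unfolding" the whole value into a flat buffer first, which is the cost Alexander wants to avoid.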

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-11 Thread Amit Kapila
On Mon, Mar 10, 2014 at 5:14 PM, Christian Kruse
christ...@2ndquadrant.com wrote:
 On 09/03/14 12:15, Amit Kapila wrote:
 [...] due to which the message it displays seems to be
 incomplete.  The message it displays is as below:

 LOG:  process 1800 still waiting for ShareLock on transaction 679 after 1014.000 ms
 CONTEXT:  while attempting to lock in relation public.idx_t1 of database postgres

 Well, there is no tuple information available, so we have to leave it out.

 Here it is not clear what it attempts to *lock in*.

 Ok, I reworded the message as you suggested below. Should make this
 clearer as well.

As per your recent changes, wherever we have tuple info it will be used in
the message; otherwise it will just use the relation info. So the message
difference will be as below, which I think is okay.

Places where tuple info not available

LOG:  process 5788 still waiting for ShareLock on transaction 679 after 1014.000 ms
CONTEXT:  while attempting to operate in relation public.idx_t1 of database postgres

Places where tuple info available

LOG:  process 5788 still waiting for ShareLock on transaction 686 after 1014.000 ms
CONTEXT:  while attempting to operate on tuple (0,1) with values (1, aaa, bbb) in relation public.t1 of database postgres

Actually, I had thought of passing tuple info from the places where it
is currently not passed, but it would need some extra code which I think
is not worth it just for message information.  For example, in case of
_bt_doinsert(), if we have to pass index tuple info, we might need to
change the prototype of XactLockTableWaitWithInfo() such that it can
accept both an IndexTuple and a HeapTuple and then use them inside the
function accordingly.  Also, I think it is okay not to display the index
tuple here, as it is still not visible.  For the other cases where we
are using a DirtySnapshot, things can be more complex in identifying
whether the tuple can be logged.  So I think it's better to leave
logging as you have done in the patch.


 Can you explain why you prefer the WithInfo naming? Just curious, it
 seemed to me the more descriptive name should have been preferred.

Actually, apart from MultiXactIdWaitWithErrorContext(), I had
thought of the options below:

1. Change the prototype of MultiXactIdWait()/XactLockTableWait()
to pass rel and tuple info, and add new functions like
XactLockTableWait_Internal() which do the actual work; but to achieve
that we would need to call the internal function from some places where
the top-level one is not called.  Such a coding convention is used in a
few places (heap_beginscan_internal).

2. Name the new functions MultiXactIdWaitExtended()/XactLockTableWaitExtended()
or MultiXactIdWaitEx()/XactLockTableWaitEx().
There are already some similar functions (ReadBufferExtended,
relation_openrv_extended).

3. MultiXactIdWaitWithInfo()/XactLockTableWaitWithInfo()

Earlier I found option 3 to be a good choice, but thinking about it
again, I am now leaning towards option 2.


 6.
 @@ -1981,7 +1981,8 @@ EvalPlanQualFetch(EState *estate, Relation
 relation, int lockmode,
   if (TransactionIdIsValid(SnapshotDirty.xmax))
   {
   ReleaseBuffer(buffer);
 - XactLockTableWait(SnapshotDirty.xmax);
 + XactLockTableWaitWithInfo(relation, &tuple,
 +  SnapshotDirty.xmax);

 In function EvalPlanQualFetch(), we are using SnapshotDirty to fetch
 the tuple, so I think there is a chance that it will log a tuple which
 otherwise would not be visible.  I don't think this is right.

 Hm, after checking source I tend to agree. I replaced it with NULL.

Today, looking into it again, I found that the function
heap_lock_updated_tuple_rec() uses SnapshotAny to fetch the tuple,
and I think for this case also it's not safe to log the tuple.

Could you please check (if you are comfortable doing so) wherever
this patch passes a tuple, whether it is okay to pass it based on
visibility rules; otherwise I will verify it again.

 8.
  void
  WaitForLockers(LOCKTAG heaplocktag, LOCKMODE lockmode)
  {
 - List   *l;
 + List   *l;

 Extra spacing not needed.

 What is the policy to do here?

The simple rule I follow is: don't touch code which has no relation
to the current patch.

 [...] and recently this is used in function SnapBuildFindSnapshot().

 Hm, should we add the context there, too?

I don't think you can use it there, as there is no relation or tuple
information available.  This is a unique kind of usage for this API,
where it is used to wait for a transaction to complete without operating
on a relation/tuple, whereas in all other places it is used to wait
while some other transaction is operating on a tuple.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-11 Thread Christian Kruse
Hi,

On 11/03/14 13:23, Amit Kapila wrote:
 [… snip …]
 So I think it's better to leave logging it as you have done in
 patch.

Agreed.

 […]
 2. Name new functions as  
 MultiXactIdWaitExtended()/XactLockTableWaitExtended()
 or MultiXactIdWaitEx()/XactLockTableWaitEx().
 You can find some other similar functions (ReadBufferExtended,
 relation_openrv_extended)
 
 3. MultiXactIdWaitWithInfo()/XactLockTableWaitWithInfo()
 
 Earlier I found option 3 as a good choice, but now again thinking on
 it I am leaning towards option 2.

Changing it once again will only cause more work and won't make a big
difference, so I suggest sticking with the current function names.

 Today, looking into it again, I found that the function
 heap_lock_updated_tuple_rec() uses SnapshotAny to fetch the tuple,
 and I think for this case also it's not safe to log the tuple.
 
 Could you please check (if you are comfortable doing so) wherever
 this patch passes a tuple, whether it is okay to pass it based on
 visibility rules; otherwise I will verify it again.

I think I got all places, but it would be nice to have a confirmation.

  [… policy regarding whitespaces …]
 The simple rule I follow is don't touch the code which has no relation
 to current patch.

OK. Thanks.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 71ec740..bcc656a 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -107,6 +107,8 @@ static TransactionId MultiXactIdGetUpdateXid(TransactionId xmax,
 		uint16 t_infomask);
 static void MultiXactIdWait(MultiXactId multi, MultiXactStatus status,
 int *remaining, uint16 infomask);
+static void MultiXactIdWaitWithInfo(Relation rel, HeapTuple tup, MultiXactId multi,
+	MultiXactStatus status, int *remaining, uint16 infomask);
 static bool ConditionalMultiXactIdWait(MultiXactId multi,
 		   MultiXactStatus status, int *remaining,
 		   uint16 infomask);
@@ -2714,8 +2716,8 @@ l1:
 		if (infomask & HEAP_XMAX_IS_MULTI)
 		{
 			/* wait for multixact */
-			MultiXactIdWait((MultiXactId) xwait, MultiXactStatusUpdate,
-			NULL, infomask);
+			MultiXactIdWaitWithInfo(relation, &tp, (MultiXactId) xwait,
+	MultiXactStatusUpdate, NULL, infomask);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -2741,7 +2743,7 @@ l1:
 		else
 		{
 			/* wait for regular transaction to end */
-			XactLockTableWait(xwait);
+			XactLockTableWaitWithInfo(relation, &tp, xwait);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3266,8 +3268,8 @@ l2:
 			int			remain;
 
 			/* wait for multixact */
-			MultiXactIdWait((MultiXactId) xwait, mxact_status, &remain,
-			infomask);
+			MultiXactIdWaitWithInfo(relation, &oldtup,
+	   (MultiXactId) xwait, mxact_status, &remain, infomask);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3306,7 +3308,7 @@ l2:
 			/*
 			 * There was no UPDATE in the MultiXact; or it aborted. No
 			 * TransactionIdIsInProgress() call needed here, since we called
-			 * MultiXactIdWait() above.
+			 * MultiXactIdWaitWithInfo() above.
 			 */
 			if (!TransactionIdIsValid(update_xact) ||
 TransactionIdDidAbort(update_xact))
@@ -3341,7 +3343,7 @@ l2:
 			else
 			{
 /* wait for regular transaction to end */
-XactLockTableWait(xwait);
+XactLockTableWaitWithInfo(relation, &oldtup, xwait);
 LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 /*
@@ -4409,7 +4411,8 @@ l3:
 		RelationGetRelationName(relation))));
 }
 else
-	MultiXactIdWait((MultiXactId) xwait, status, NULL, infomask);
+	MultiXactIdWaitWithInfo(relation, tuple,
+(MultiXactId) xwait, status, NULL, infomask);
 
 /* if there are updates, follow the update chain */
 if (follow_updates &&
@@ -4464,7 +4467,7 @@ l3:
 		RelationGetRelationName(relation))));
 }
 else
-	XactLockTableWait(xwait);
+	XactLockTableWaitWithInfo(relation, tuple, xwait);
 
 /* if there are updates, follow the update chain */
 if (follow_updates &&
@@ -5151,7 +5154,7 @@ l4:
 	if (needwait)
 	{
 		LockBuffer(buf, BUFFER_LOCK_UNLOCK);
-		XactLockTableWait(members[i].xid);
+		XactLockTableWaitWithInfo(rel, NULL, members[i].xid);
 		pfree(members);
 		goto l4;
 	}
@@ -5211,7 +5214,7 @@ l4:
 if (needwait)
 {
 	LockBuffer(buf, BUFFER_LOCK_UNLOCK);
-	XactLockTableWait(rawxmax);
+	XactLockTableWaitWithInfo(rel, NULL, rawxmax);
 	goto l4;
 }
 if (res != HeapTupleMayBeUpdated)
@@ -6169,6 +6172,33 @@ MultiXactIdWait(MultiXactId multi, MultiXactStatus status,
 }
 
 /*
+ * MultiXactIdWaitWithInfo
+ *		Sets up the error context callback for reporting tuple
+ *		information and relation for a lock to acquire
+ *
+ * Use this instead of calling MultiXactIdWait()
+ */
+static void
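The hunk above is cut off, but its shape is clear from the comment: push an error context callback carrying the relation/tuple info, call the plain wait function, pop the callback. That push/call/pop pattern can be sketched in standalone C. Everything below (the ContextCallback struct, the function names, the report buffer) is an invented analog of PostgreSQL's ErrorContextCallback machinery, not the actual patch code:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A linked stack of context callbacks, like PG's error_context_stack. */
typedef struct ContextCallback
{
	struct ContextCallback *previous;
	void		(*callback) (void *arg);
	void	   *arg;
} ContextCallback;

static ContextCallback *context_stack = NULL;
static char report_buf[256];

/* Emit a log line, then let each stacked callback add a CONTEXT line. */
static void
emit_report(const char *msg)
{
	snprintf(report_buf, sizeof(report_buf), "LOG: %s", msg);
	for (ContextCallback *c = context_stack; c; c = c->previous)
		c->callback(c->arg);
}

typedef struct WaitInfo
{
	const char *relname;
} WaitInfo;

static void
wait_info_callback(void *arg)
{
	WaitInfo   *info = arg;
	size_t		used = strlen(report_buf);

	snprintf(report_buf + used, sizeof(report_buf) - used,
			 "\nCONTEXT: while operating in relation %s", info->relname);
}

/* Stand-in for XactLockTableWait(): pretend we time out and report. */
static void
xact_lock_table_wait(void)
{
	emit_report("process still waiting for ShareLock");
}

/* Stand-in for XactLockTableWaitWithInfo(): push, wait, pop. */
static void
xact_lock_table_wait_with_info(const char *relname)
{
	WaitInfo	info = {relname};
	ContextCallback cb = {context_stack, wait_info_callback, &info};

	context_stack = &cb;
	xact_lock_table_wait();
	context_stack = cb.previous;
}
```

The design point the patch exploits is that the wrapper adds the CONTEXT line without the underlying wait routine knowing anything about relations or tuples, which is also why callers that lack that information (like SnapBuildFindSnapshot()) can keep using the plain function.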

Re: [HACKERS] COPY table FROM STDIN doesn't show count tag

2014-03-11 Thread Rajeev rastogi
On 10 March 2014 23:44, Tom Lane wrote:

 Unfortunately, while testing it I noticed that there's a potentially
 fatal backwards-compatibility problem, namely that the COPY n status
 gets printed on stdout, which is the same place that COPY OUT data is
 going.  While this isn't such a big problem for interactive use, usages
 like this one are pretty popular:
 
psql -c 'copy mytable to stdout' mydatabase | some-program
 
 With the patch, COPY n gets included in the data sent to some-program,
 which never happened before and is surely not what the user wants.
 The same if the -c string uses \copy.
 
 There are several things we could do about this:
 
 1. Treat this as a non-backwards-compatible change, and document that
 people have to use -q if they don't want the COPY tag in the output.
 I'm not sure this is acceptable.
 
 2. Kluge ProcessResult so that it continues to not pass back a PGresult
 for the COPY TO STDOUT case, or does so only in limited circumstances
 (perhaps only if isatty(stdout), for instance).

 I'm inclined to think #2 is the best answer if we can't stomach #1.

Is it OK to have different status output for different flavors of the COPY
command? I am afraid that it will become kind of inconsistent.

Also, not providing the command result status may be inconsistent with the
behavior of other SQL commands.

I agree that it breaks backward compatibility, but I am not sure if anyone
is so tightly coupled with this (or whether they will be affected by the
additional status result).

To me option #1 seems more suitable, especially since there is an option to
disable the status output by giving -q.

Please provide your opinion or let me know if I have missed something.

Thanks and Regards,
Kumar Rajeev Rastogi






Re: [HACKERS] Is SPI safe to use in multi-threaded PL/Java?

2014-03-11 Thread MauMau

From: Tom Lane t...@sss.pgh.pa.us

MauMau maumau...@gmail.com writes:
To put the question in other words, is it safe to load a multi-threaded PL
library in the single-threaded backend process, if the PL only calls SPI in
the main thread?


When it breaks, we're not going to be concerned.


I may not understand your nuance.  Which of the following do you mean?

* PL/Java's design is dangerous in terms of the mixture of single- and
multi-threading, and we cannot be 100% sure whether there's really no
problem.

* SPI must not be used in a multi-threaded process, even if only one
thread calls SPI functions at a time.  So what we can say is that
PL/Java is theoretically not safe in terms of SPI.


Regards
MauMau









[HACKERS] Disk usage for intermediate results in join

2014-03-11 Thread Parul Lakkad
Hi,

I am trying to figure out when disk is used to store intermediate results
while performing joins in Postgres.

According to my findings using 'EXPLAIN ANALYZE', only merge sort uses disk.
Can anyone please throw some more light on this?

Thanks,
Parul


Re: [HACKERS] [ISSUE] pg_dump: schema with OID 0 does not exist

2014-03-11 Thread Robert Haas
On Tue, Mar 11, 2014 at 12:23 AM, Prakash Itnal prakash...@gmail.com wrote:
 Can someone confirm whether this is really an issue, or suggest any
 reasons for the missing rows?

Well, your database is definitely getting corrupted somehow.  But
there's no information in your email which would enable someone to
guess how.

I blogged on this subject a while back, but I don't know whether it's
relevant to your situation:

http://rhaas.blogspot.com/2012/03/why-is-my-database-corrupted.html

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Disk usage for intermediate results in join

2014-03-11 Thread Heikki Linnakangas

On 03/11/2014 01:24 PM, Parul Lakkad wrote:

Hi,

I am trying to figure out when disk is used to store intermediate results
while performing joins in postgres.

According to my findings using 'EXPLAIN ANALYZE', only merge sort uses disk.
Can anyone please throw some more light on this?


Hash joins will also spill to disk if the hash-side of the join is large 
enough. The planner usually tries to avoid it, but sometimes it happens.


- Heikki




Re: [HACKERS] [bug fix] pg_ctl always uses the same event source

2014-03-11 Thread Alvaro Herrera
MauMau wrote:
 Hi, Amit san,
 
 I'm replying to your previous email.  I wanted to reply to your
 latest mail below, but I removed it from my mailer by mistake.
 
 http://www.postgresql.org/message-id/CAA4eK1LAg6ndZdWLb5e=Ep5DzcE8KZU=JbmO+tFwySYHm2ja=q...@mail.gmail.com
 
 Do you know how I can reply to an email which was deleted locally?
 I thought I could download an old mail by clicking the raw link and
 importing it into the mailer.  However, it requires username/password
 input, which seems to be different from the one for editing the
 CommitFest.  I couldn't find out how to authenticate myself.

The box that asks for a password tells you what the user/password is.  I
think it's something like archives/archives or similar.  The password is
there only to keep spammers out, not to provide any real auth.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Magnus Hagander
On Sun, Mar 9, 2014 at 9:00 PM, Thom Brown t...@linux.com wrote:

 Hi,

 I've noticed that db_user_namespace has had the following note
 attached to it since 2002:

 This feature is intended as a temporary measure until a complete
 solution is found. At that time, this option will be removed.

 It will be 12 years this year since this temporary measure was
 added.  I'm just wondering, is there any complete solution that
 anyone had in mind yet?  Or should this just be deprecated?


I'd say +1 to remove it. That would also make it possible to get rid of
password authentication...

But we might want to officially deprecate it in 9.4, and then remove it
later?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Sun, Mar 9, 2014 at 9:00 PM, Thom Brown t...@linux.com wrote:
 It will be 12 years this year since this temporary measure was
 added.  I'm just wondering, is there any complete solution that
 anyone had in mind yet?  Or should this just be deprecated?

 I'd say +1 to remove it. That would also make it possible to get rid of
 password authentication...

If we remove it without providing a substitute feature, people who are
using it will rightly complain.

Are you claiming there are no users, and if so, on what evidence?

regards, tom lane




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Magnus Hagander
On Tue, Mar 11, 2014 at 2:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Magnus Hagander mag...@hagander.net writes:
  On Sun, Mar 9, 2014 at 9:00 PM, Thom Brown t...@linux.com wrote:
  It will be 12 years this year since this temporary measure was
  added.  I'm just wondering, is there any complete solution that
  anyone had in mind yet?  Or should this just be deprecated?

  I'd say +1 to remove it. That would also make it possible to get rid of
  password authentication...

 If we remove it without providing a substitute feature, people who are
 using it will rightly complain.

 Are you claiming there are no users, and if so, on what evidence?


I am claiming that I don't think anybody is using it, yes.

Based on the fact that I have *never* come across it on any system I've
seen since, well, forever. Except once, I think, many years ago, when
someone had enabled it by mistake and needed my help to remove it...

But we should absolutely deprecate it first in that case. Preferably
visibly (e.g. with a log message when people use it). That would at
least get the people who use it to let us know they do, so we can figure
out whether anybody actually does - and de-deprecate it if so.

Or if someone wants to fix it properly, of course :)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Is SPI safe to use in multi-threaded PL/Java?

2014-03-11 Thread Tom Lane
MauMau maumau...@gmail.com writes:
 From: Tom Lane t...@sss.pgh.pa.us
 When it breaks, we're not going to be concerned.

 I may not understand your nuance.  Which of the following do you mean?

 * PL/Java's design is dangerous in terms of the mixture of single- and 
 multi-threading, and we cannot be 100% sure whether there's really no 
 problem.

That, more or less.  There is exactly zero provision in the Postgres
code for multiple threads to exist inside a backend process.  It's
possible that PL/Java manages to completely insulate the Java world
from the C world, so that the C code never sees more than one thread.
But any leakage at all in that abstraction is probably going to cause
bugs; and as I said, we (PG hackers) are not going to consider such
bugs to be our problem.

On platforms where the standard libc supports threading (which is most,
these days), I'd be particularly worried about leakage along the path
java -> libc -> postgres.  If libc becomes aware that there are multiple
threads executing inside the process, it's likely to change behaviors.

regards, tom lane




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Tue, Mar 11, 2014 at 2:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Are you claiming there are no users, and if so, on what evidence?

 I am claiming that I don't think anybody is using that, yes.

 Based on the fact that I have *never* come across it on any system I've
 come across since, well, forever. Except once I think, many years ago, when
 someone had enabled it by mistake and needed my help to remove it...

A slightly more scientific basis for that would be to ask on
pgsql-general.

 Or if someone wants to fix it properly of course :)

Yeah, that's what we've been hoping for for 12 years.  I stopped holding
my breath awhile ago.

Mind you, I wouldn't be unhappy to see it go away; it's a kluge and always
has been.  I'm just expecting lots of push-back if we try.  And it's kind
of hard to resist push-back when you don't have a substitute to offer.

regards, tom lane




Re: [HACKERS] COPY table FROM STDIN doesn't show count tag

2014-03-11 Thread Tom Lane
Rajeev rastogi rajeev.rast...@huawei.com writes:
 On 10 March 2014 23:44, Tom Lane wrote:
 Unfortunately, while testing it I noticed that there's a potentially
 fatal backwards-compatibility problem, namely that the COPY n status
 gets printed on stdout, which is the same place that COPY OUT data is
 going.  While this isn't such a big problem for interactive use, usages
 like this one are pretty popular:
 
 psql -c 'copy mytable to stdout' mydatabase | some-program
 
 With the patch, COPY n gets included in the data sent to some-program,
 which never happened before and is surely not what the user wants.
 The same if the -c string uses \copy.

 Is it OK to have different status output for different flavors of the COPY
 command?  I am afraid that it will become kind of inconsistent.

Well, that's the big question here.

We already do have different status output for different kinds of COPY,
ie we don't report it for COPY FROM STDIN/TO STDOUT.  What now emerges
is that there's a good reason for the omission in the case of TO STDOUT.
I certainly hadn't remembered that, and there's no documentation of it
in either code comments or the SGML docs.

After sleeping on it, I'm inclined to think we should continue to not
print status for COPY TO STDOUT.  Aside from the risk of breaking scripts,
there's a decent analogy to be made to SELECT: we don't print a status
tag for that either.

That leaves the question of whether we want to start printing a tag for
the COPY FROM STDIN case.  I don't think that'd create much risk of
breaking anything, and the analogy to SELECT doesn't hold either.
OTOH, Robert opined upthread that FROM STDIN and TO STDOUT shouldn't
behave differently; does that argument still impress anyone?  And given
that different COPY cases are still going to behave differently, maybe
we should just stick with the status quo.  It's been like this for a
mighty long time with few complaints.

In any case, some documentation and code comment changes would be
appropriate ...

regards, tom lane




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Andrew Dunstan


On 03/11/2014 09:57 AM, Tom Lane wrote:

Magnus Hagander mag...@hagander.net writes:

On Tue, Mar 11, 2014 at 2:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:

Are you claiming there are no users, and if so, on what evidence?

I am claiming that I don't think anybody is using that, yes.
Based on the fact that I have *never* come across it on any system I've
come across since, well, forever. Except once I think, many years ago, when
someone had enabled it by mistake and needed my help to remove it...

A slightly more scientific basis for that would be to ask on
pgsql-general.


Or if someone wants to fix it properly of course :)

Yeah, that's what we've been hoping for for 12 years.  I stopped holding
my breath awhile ago.

Mind you, I wouldn't be unhappy to see it go away; it's a kluge and always
has been.  I'm just expecting lots of push-back if we try.  And it's kind
of hard to resist push-back when you don't have a substitute to offer.





Or we try to make it work. I don't think the idea is inherently bad, and 
I know there are people (like ISPs) who would like to have it work 
properly. Maybe in these days when most people are on dedicated VMs this 
matters less, but I don't think shared database servers are totally dead 
yet.


The docs say:

   db_user_namespace causes the client's and server's user name
   representation to differ. Authentication checks are always done with
   the server's user name so authentication methods must be configured
   for the server's user name, not the client's. Because md5 uses the
   user name as salt on both the client and server, md5 cannot be used
   with db_user_namespace.

Is that the only major issue? Why not have the server strip out the @db 
part if this is on? If we made this an initdb-time setting rather than a 
GUC then we'd remove the problems caused by turning this on and off. I'm 
not sure what other problems that might cause, but it doesn't seem 
totally intractable at first glance.


cheers

andrew




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Stephen Frost
* Andrew Dunstan (and...@dunslane.net) wrote:
 Or we try to make it work. I don't think the idea is inherently bad,
 and I know there are people (like ISPs) who would like to have it
 work properly. Maybe in these days when most people are on dedicated
 VMs this matters less, but I don't think shared database servers are
 totally dead yet.

Agreed.  There are certainly pretty big hosting companies out there
which are already doing multi-tenant PG, but they're using their own
approaches instead of anything we provide (because what we provide
sucks, basically..).

 The docs say:
 
db_user_namespace causes the client's and server's user name
representation to differ. Authentication checks are always done with
the server's user name so authentication methods must be configured
for the server's user name, not the client's. Because md5 uses the
user name as salt on both the client and server, md5 cannot be used
with db_user_namespace.
 
 Is that the only major issue? Why not have the server strip out the
 @db part if this is on? If we made this an initdb-time setting
 rather than a GUC then we'd remove the problems caused by turning
 this on and off. I'm not sure what other problems that might cause,
 but it doesn't seem totally intractable at first glance.

Isn't the other issue for ISPs essentially that we don't have row-level
security for our global catalogs?  As in, we can't limit what's in
pg_authid to only those entries a given user should be able to see?  I
don't think db_user_namespace addresses that issue (but I didn't go look
either).

Thanks,

Stephen




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Andrew Dunstan


On 03/11/2014 12:37 PM, Stephen Frost wrote:


Isn't the other issue for ISPs essentially that we don't have row-level
security for our global catalogs?  as in- we can't limit what's in
pg_authid to only those entries a given user should be able to see?  I
don't think db_user_namespace addresses that issue (but I didn't go look
either).





Possibly, but we don't have to solve every problem at once. As the 
Sermon on the Mount says, "Sufficient unto the day is the evil thereof."


cheers

andrew





Re: [HACKERS] Retain dynamic shared memory segments for postmaster lifetime

2014-03-11 Thread Robert Haas
On Mon, Mar 10, 2014 at 11:26 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Mon, Mar 10, 2014 at 11:37 PM, Robert Haas robertmh...@gmail.com wrote:
 Looks good, committed.  However, I changed it so that
 dsm_keep_segment() does not also perform the equivalent of
 dsm_keep_mapping(); those are two separate operations.

 So are you expecting that if someone needs to retain a dynamic segment
 till PM lifetime, they should call both dsm_keep_segment() and
 dsm_keep_mapping()?

If they want to keep both the mapping and the segment, yes.  But in
general those two things are independent of each other.  A process
could want to map the segment and store some data in it, and then it
could want to unmap the segment; and then later the segment could be
mapped again (perhaps from some other backend) to get the data out.

 If we don't call both, it can lead to following warning:
 postgres=# select dsm_demo_create('this message is from session-new', 1);
 WARNING:  dynamic shared memory leak: segment 1402373971 still referenced

Well, that's just an artifact of the coding of dsm_demo_create().
Code that doesn't use dsm_keep_mapping() needs to be sure to call
dsm_detach() in the non-error path.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Why is AccessShareLock held until end of transaction?

2014-03-11 Thread Simon Riggs
On 11 March 2014 03:41, Tom Lane t...@sss.pgh.pa.us wrote:
 Joe Conway m...@joeconway.com writes:
 I am probably missing something obvious, but why does the
 AccessShareLock remain held on a table after a SELECT statement is
 complete when in a transaction block?

 *Any* lock acquired by user command is held till end of transaction;
 AccessShareLock isn't special.

 In general, releasing early would increase the risk of undesirable
 behaviors such as tables changing definition mid-transaction.

I thought it a good question at first, but the workaround is simple...
just don't use multi-step transactions; submit each request as a
separate transaction.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] [PATCH] Store Extension Options

2014-03-11 Thread Alvaro Herrera
Tom Lane escribió:
 =?ISO-8859-1?Q?Fabr=EDzio_de_Royes_Mello?= fabriziome...@gmail.com writes:
  You are correct. pg_dump export reloptions using WITH clause of CREATE
  TABLE statement. I.e.:
 
  CREATE TABLE foo (
  )
  WITH (autovacuum_enabled=false, bdr.do_replicate=false);
 
  So if this statement checks for 'bdr' extension is loaded then in partial
  restore it can be fail.
 
 I see absolutely *nothing* wrong with failing that command if bdr is not
 installed.  For an analogy, if this table includes a column of type bar
 defined by some extension baz, we are certainly going to fail the
 CREATE TABLE if baz isn't installed.
 
 Now, if bdr is installed but the validation doesn't happen unless bdr
 is loaded in some sense, then that is an implementation deficiency
 that I think we can insist be rectified before this feature is accepted.

So, I spent some time on this patch the last couple of days to add some
validation.  But after submitting it, it seems to me that there wasn't
as much consensus on how to handle validation as I'd first thought.

So, the first and simplest way to go about this, of course, is just not
validate anything. This is what Fabrízio's patch does.  So the table
owner can execute
  ALTER TABLE mary SET (supercalifragilisticexpialidocious.umbrella_flight = 
'hell yeah')
and that would be it.  Whether a module makes use of this later or not,
is not that guy's problem.  This is mostly what we do for GUCs, note, so
it's not exactly groundbreaking.

As a second possibility, my patch as posted allows one to register a
namespace.  So pg_dump can do this:
  SELECT pg_register_option_namespace('supercalifragilisticexpialidocious');
and then create the table just like the above ALTER TABLE.  Note that
the variable name, and the value, are not checked until later.  If a
module comes later and says "hey, I own that super- option namespace,
and I have option umbrella_flight but it's a boolean" (by calling
add_bool_reloption), that will raise an error.  Note that in my patch as
posted, if you set the parameter umbrella_flight='better not' on an
index, but the parameter has only been defined for tables
(RELOPT_KIND_HEAP), it will be silently accepted.  Also note that we can
add a function equivalent to EmitWarningOnPlaceholders (Andres' idea),
so that any unrecognized option will cause some noise and it can be
identified right away.  Since only table owners can set options, this
seems more than good to me; it's not like table owners are going to mess
up by adding pointless options just for the heck of it.
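As a thought experiment, the register-namespace-then-validate-lazily behaviour of this second proposal can be modelled in a few lines of Python. The function names mirror the ones mentioned in this thread, but the code is purely illustrative, not the patch's C implementation.

```python
# Hypothetical model of the second proposal: namespaces are registered up
# front, option values are stored unvalidated, and checking happens only
# when a module later claims the option.
registered_namespaces = set()
stored_options = {}   # (namespace, option) -> raw string value

def pg_register_option_namespace(ns):
    registered_namespaces.add(ns)

def set_reloption(ns, opt, value):
    """ALTER TABLE ... SET (ns.opt = value): accepted without validation."""
    if ns not in registered_namespaces:
        raise ValueError(f'unrecognized option namespace "{ns}"')
    stored_options[(ns, opt)] = value

def add_bool_reloption(ns, opt):
    """A module claims an option; stored values are only checked now."""
    raw = stored_options.get((ns, opt))
    if raw is not None and raw not in ("true", "false"):
        raise ValueError(f'invalid boolean value "{raw}" for {ns}.{opt}')

pg_register_option_namespace("supercalifragilisticexpialidocious")
set_reloption("supercalifragilisticexpialidocious", "umbrella_flight", "hell yeah")
try:
    add_bool_reloption("supercalifragilisticexpialidocious", "umbrella_flight")
except ValueError as err:
    print(err)   # the error surfaces only once the module claims the option
```

Setting an option in an unregistered namespace fails immediately, while a bogus value in a registered namespace is only rejected when the owning module declares the option's type.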


A further possibility requires modules to also register options (not
only namespaces), and to validate each and every option as soon as it's
created.  So if you try to set an option that has not been previously
registered by a module, that will raise an error right there.  This
seems to be what Tom, Robert and Peter want.  However, getting there
seems very laborious; apparently, we will need a new system catalog to
register option validators, for starters.  We'll also need a way to load
the relevant module whenever a table gets an option belonging to a module
that isn't loaded.  (I think this will fall out automatically if the
validator is registered, because when the validator is called, the module
is loaded by the system.)

One slight problem with this is what to do with extensions that don't
provide any C code.  Some use cases require options that can be set and
accessed only from SQL code.


My question here for the hackers community at large is this: are we okay
with implementing the second option I propose above?  If we are, then I
will see about finalizing the patch and getting it committed.  If we are
not, and we're determined that only the third option is acceptable, I
will jump out of this thread and stop wasting everyone's time.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Why is AccessShareLock held until end of transaction?

2014-03-11 Thread Atri Sharma
On Tue, Mar 11, 2014 at 10:56 PM, Simon Riggs si...@2ndquadrant.com wrote:

 On 11 March 2014 03:41, Tom Lane t...@sss.pgh.pa.us wrote:
  Joe Conway m...@joeconway.com writes:
  I am probably missing something obvious, but why does the
  AccessShareLock remain held on a table after a SELECT statement is
  complete when in a transaction block?
 
  *Any* lock acquired by user command is held till end of transaction;
  AccessShareLock isn't special.
 
  In general, releasing early would increase the risk of undesirable
  behaviors such as tables changing definition mid-transaction.

 I thought good question at first, but the workaround is simple...
 just don't use multi-step transactions, submit each request as a
 separate transaction.


 Wouldn't that tend to get inefficient?

Regards,

Atri



-- 
Regards,

Atri
*l'apprenant*


Re: [HACKERS] Why is AccessShareLock held until end of transaction?

2014-03-11 Thread Simon Riggs
On 11 March 2014 17:29, Atri Sharma atri.j...@gmail.com wrote:



 On Tue, Mar 11, 2014 at 10:56 PM, Simon Riggs si...@2ndquadrant.com wrote:

 On 11 March 2014 03:41, Tom Lane t...@sss.pgh.pa.us wrote:
  Joe Conway m...@joeconway.com writes:
  I am probably missing something obvious, but why does the
  AccessShareLock remain held on a table after a SELECT statement is
  complete when in a transaction block?
 
  *Any* lock acquired by user command is held till end of transaction;
  AccessShareLock isn't special.
 
  In general, releasing early would increase the risk of undesirable
  behaviors such as tables changing definition mid-transaction.

 I thought good question at first, but the workaround is simple...
 just don't use multi-step transactions, submit each request as a
 separate transaction.


 Wouldn't that tend to get inefficient?

Please outline your alternate proposal so we can judge the comparative
efficiency.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Why is AccessShareLock held until end of transaction?

2014-03-11 Thread Atri Sharma
On Tue, Mar 11, 2014 at 11:07 PM, Simon Riggs si...@2ndquadrant.com wrote:

 On 11 March 2014 17:29, Atri Sharma atri.j...@gmail.com wrote:
 
 
 
  On Tue, Mar 11, 2014 at 10:56 PM, Simon Riggs si...@2ndquadrant.com
 wrote:
 
  On 11 March 2014 03:41, Tom Lane t...@sss.pgh.pa.us wrote:
   Joe Conway m...@joeconway.com writes:
   I am probably missing something obvious, but why does the
   AccessShareLock remain held on a table after a SELECT statement is
   complete when in a transaction block?
  
   *Any* lock acquired by user command is held till end of transaction;
   AccessShareLock isn't special.
  
   In general, releasing early would increase the risk of undesirable
   behaviors such as tables changing definition mid-transaction.
 
  I thought good question at first, but the workaround is simple...
  just don't use multi-step transactions, submit each request as a
  separate transaction.
 
 
  Wouldn't that tend to get inefficient?

 Please outline your alternate proposal so we can judge the comparative
 efficiency.



I don't have an alternate proposal yet. I was just wondering if per-step
transactions could lead to a drop in performance.

If that is the best way to go, I am all for it.

Regards,

Atri



-- 
Regards,

Atri
*l'apprenant*


Re: [HACKERS] [PATCH] Store Extension Options

2014-03-11 Thread Simon Riggs
On 11 March 2014 17:26, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 Tom Lane escribió:
 =?ISO-8859-1?Q?Fabr=EDzio_de_Royes_Mello?= fabriziome...@gmail.com writes:
  You are correct. pg_dump export reloptions using WITH clause of CREATE
  TABLE statement. I.e.:

  CREATE TABLE foo (
  )
  WITH (autovacuum_enabled=false, bdr.do_replicate=false);

  So if this statement checks for 'bdr' extension is loaded then in partial
  restore it can be fail.

 I see absolutely *nothing* wrong with failing that command if bdr is not
 installed.  For an analogy, if this table includes a column of type bar
 defined by some extension baz, we are certainly going to fail the
 CREATE TABLE if baz isn't installed.

 Now, if bdr is installed but the validation doesn't happen unless bdr
 is loaded in some sense, then that is an implementation deficiency
 that I think we can insist be rectified before this feature is accepted.

 So, I spent some time on this patch the last couple of days to add some
 validation.  But after submitting it, it seems to me that there wasn't
 as much consensus on how to handle validation as I'd first thought.

 So, the first and simplest way to go about this, of course, is just not
 validate anything. This is what Fabrízio's patch does.  So the table
 owner can execute
   ALTER TABLE mary SET (supercalifragilisticexpialidocious.umbrella_flight = 
 'hell yeah')
 and that would be it.  Whether a module makes use of this later or not,
 is not that guy's problem.  This is mostly what we do for GUCs, note, so
 it's not exactly groundbreaking.

If a module fails to use a parameter, that may be a problem. But
forcing us to validate this using user-written code may not improve
the situation.

What happens if I have two extensions that both use the namespace foo?
That means we would run two validation routines on it, and if they
disagree on the set of options and values then we are hosed.

-1 to *requiring* validation for table-level options for exactly the
same reasons we no longer validate custom GUCs.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Why is AccessShareLock held until end of transaction?

2014-03-11 Thread Joe Conway

On 03/11/2014 12:26 PM, Simon Riggs wrote:
 On 11 March 2014 03:41, Tom Lane t...@sss.pgh.pa.us wrote:
 Joe Conway m...@joeconway.com writes:
 I am probably missing something obvious, but why does the 
 AccessShareLock remain held on a table after a SELECT statement
 is complete when in a transaction block?
 
 *Any* lock acquired by user command is held till end of
 transaction; AccessShareLock isn't special.
 
 In general, releasing early would increase the risk of
 undesirable behaviors such as tables changing definition
 mid-transaction.
 
 I thought good question at first, but the workaround is
 simple... just don't use multi-step transactions, submit each
 request as a separate transaction.

Yeah, I told them that already. Unfortunately in this environment it
is not an option. It isn't a huge problem, but I did find it
surprising (as did the client) that a purely read-only transaction
could cause a deadlock with a concurrent CREATE TABLE.

It would seem that once the SELECT statement has finished we could
drop the AccessShareLock, but I guess that would open a can of worms
that we don't want to contemplate.
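The deadlock described above (a read-only transaction's retained ACCESS SHARE lock versus concurrent DDL) boils down to a cycle in the wait-for graph. A toy model, with made-up transaction and table names:

```python
# Toy wait-for-graph model of the scenario above (names are made up):
# T1 still holds ACCESS SHARE on table "a" from an earlier SELECT and
# now wants a lock on "b"; T2 (the DDL) holds a lock on "b" and wants
# one on "a".  Neither can proceed: a deadlock.
held = {"T1": {"a"}, "T2": {"b"}}
wants = {"T1": "b", "T2": "a"}

def waits_for(txn):
    """Return the transaction holding the lock that txn wants, if any."""
    target = wants.get(txn)
    if target is None:
        return None
    return next((t for t, locks in held.items() if target in locks), None)

def has_deadlock(start):
    """Follow the wait-for chain; revisiting a node means a cycle."""
    seen, txn = set(), start
    while txn is not None and txn not in seen:
        seen.add(txn)
        txn = waits_for(txn)
    return txn is not None

print(has_deadlock("T1"))  # True: T1 -> T2 -> T1
```

Releasing the share lock at end of statement instead of end of transaction would remove T1's held lock on "a" and break the cycle, which is exactly the behaviour change under discussion.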

Joe


-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting,  24x7 Support




Re: [HACKERS] Cleaner build output when not much has changed

2014-03-11 Thread Gurjeet Singh
On Mon, Mar 10, 2014 at 8:12 PM, Alvaro Herrera alvhe...@2ndquadrant.comwrote:

 Gurjeet Singh wrote:
  On Tue, Nov 26, 2013 at 12:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 
   Gurjeet Singh gurj...@singh.im writes:
I was looking for ways to reduce the noise in Postgres make output,
    specifically, I wanted to eliminate the "Nothing to be done for
    `all'" messages, since they don't add much value, and just add to the
    clutter.
  
   Why don't you just use make -s if you don't want to see that?
   The example output you show is not much less verbose than before.
 
  I have a shell function that now adds --no-print-directory to my make
  command. This patch combined with that switch makes the output really
 clean
  (at least from my perspective). Since the use of a command-line switch
 can
  be easily left to personal choice, I am not proposing to add that or its
  makefile-equivalent. But modifying the makefiles to suppress noise is not
  something everyone can be expected to do, and do right.

 FWIW you can add a src/Makefile.custom file with this:

 all:
 @true

 and it will get rid of most noise.


As I noted in the first email in this thread, this causes a warning:

GNUmakefile:14: warning: overriding commands for target `all'
/home/gurjeet/dev/POSTGRES/src/Makefile.custom:2: warning: ignoring old
commands for target `all'

I have since settled for `make -s`. On slow builds it keeps me guessing for
a long time, without any indication of progress, but I've learned to live
with that.

Best regards,
-- 
Gurjeet Singh http://gurjeet.singh.im/

EDB www.EnterpriseDB.com http://www.enterprisedb.com


Re: [HACKERS] [PATCH] Store Extension Options

2014-03-11 Thread Fabrízio de Royes Mello
On Tue, Mar 11, 2014 at 2:43 PM, Simon Riggs si...@2ndquadrant.com wrote:

 On 11 March 2014 17:26, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
  Tom Lane escribió:
  =?ISO-8859-1?Q?Fabr=EDzio_de_Royes_Mello?= fabriziome...@gmail.com
writes:
   You are correct. pg_dump export reloptions using WITH clause of
CREATE
   TABLE statement. I.e.:
 
   CREATE TABLE foo (
   )
   WITH (autovacuum_enabled=false, bdr.do_replicate=false);
 
   So if this statement checks for 'bdr' extension is loaded then in
partial
   restore it can be fail.
 
  I see absolutely *nothing* wrong with failing that command if bdr is
not
  installed.  For an analogy, if this table includes a column of type bar
  defined by some extension baz, we are certainly going to fail the
  CREATE TABLE if baz isn't installed.
 
  Now, if bdr is installed but the validation doesn't happen unless bdr
  is loaded in some sense, then that is an implementation deficiency
  that I think we can insist be rectified before this feature is
accepted.
 
  So, I spent some time on this patch the last couple of days to add some
  validation.  But after submitting it, it seems to me that there wasn't
  as much consensus on how to handle validation as I'd first thought.
 
  So, the first and simplest way to go about this, of course, is just not
  validate anything. This is what Fabrízio's patch does.  So the table
  owner can execute
ALTER TABLE mary SET
(supercalifragilisticexpialidocious.umbrella_flight = 'hell yeah')
  and that would be it.  Whether a module makes use of this later or not,
  is not that guy's problem.  This is mostly what we do for GUCs, note, so
  it's not exactly groundbreaking.

 If a module fails to use a parameter that may be a problem. But
 forcing us to validate this using user written code may not improve
 the situation.

 What happens if I have two extensions that both use the namespace foo?
 That means we would run two validation routines on it, and if they
 disagree on the set of options and values then we are hosed.

 -1 to *requiring* validation for table-level options for exactly the
 same reasons we no longer validate custom GUCs.


In a previous email [1] I asked about alternatives to drive the work, but
unfortunately no one replied. So, because we already do that for custom GUCs,
and it is the simplest way to implement this feature, that is what I did.

Regards,

[1]
http://www.postgresql.org/message-id/cafcns+r1zxtruzleceuj1sf9qr6ciks7he-esmkzoznh4nx...@mail.gmail.com

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] [PATCH] Store Extension Options

2014-03-11 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 -1 to *requiring* validation for table-level options for exactly the
 same reasons we no longer validate custom GUCs.

Well, that is an interesting analogy, but I'm not sure how much it applies
here.  In the case of a GUC, you can fairly easily validate it once the
module does get loaded (and before the module actually tries to do
anything with it).  I don't see how that's going to work for table
options.  I trust nobody is seriously proposing that on module load, we're
going to scan the whole of pg_class looking to see if there are incorrect
settings.  (Even if we did, what would we do about it?  Not try to force a
pg_class update, for sure.  And what if the module is loading into the
postmaster thanks to a preload spec?)

I don't really think partial validation makes sense.  We could just remove
the whole topic, and tell extension authors that it's up to them to defend
themselves against bizarre values stored for their table options.  But I'm
wondering if there's really so much use-case for a feature like that.

regards, tom lane




[HACKERS] The case against multixact GUCs

2014-03-11 Thread Josh Berkus
Hackers,

In the 9.3.3 updates, we added three new GUCs to control multixact
freezing.  This was an unprecedented move in my memory -- I can't recall
ever adding a GUC to a minor release which wasn't backwards
compatibility for a security fix.  This was a mistake.

What makes these GUCs worse is that nobody knows how to set them; nobody
on this list and nobody in the field.  Heck, I doubt 1 in 1000 of our
users (or 1 in 10 people on this list) know what a multixact *is*.

Further, there's no clear justification why these cannot be set to be
the same as our other freeze ages (which our users also don't
understand), or a constant calculated portion of them, or just a
constant.  Since nobody anticipated someone adding a GUC in a minor
release, there was no discussion of this topic that I can find; the new
GUCs were added as a side effect of fixing the multixact vacuum issue.
 Certainly I would have raised a red flag if the discussion of the new
GUCs hadn't been buried deep inside really long emails.

Adding new GUCs which nobody has any idea how to set, or can even
explain to new users, is not a service to our users.  These should be
removed.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




[HACKERS] logical decoding documentation?

2014-03-11 Thread Peter Eisentraut
Where, if anywhere, is the current documentation for writing or using a
logical decoding output plugin consumer thingy?

I'm trying to find my way into it ...

src/backend/replication/logical/logical.c, which textually contains most
of the functions that appear to interact with the test_decoding module,
contains this in the header comment:


The idea is that a consumer provides three callbacks, one to read WAL,
one to prepare a data write, and a final one for actually writing since
their implementation depends on the type of consumer.  Check
logicalfunc.c for an example implementations of a fairly simple consumer
and a implementation of a WAL reading callback that's suitable for
simpler consumers.


There is no file logicalfunc.c.  And test_decoding actually uses five
callbacks, not three.

Is a consumer the same as a decoder?

test_decoding.c contains this:

/* These must be available to pg_dlsym() */
static void pg_decode_startup(LogicalDecodingContext *ctx,
OutputPluginOptions *opt, bool is_init);
...

which is surely wrong.




Re: [HACKERS] logical decoding documentation?

2014-03-11 Thread Andres Freund
Hi,

On 2014-03-11 15:57:39 -0400, Peter Eisentraut wrote:
 Where, if anywhere, is the current documentation for writing or using a
 logical decoding output plugin consumer thingy?

There's a pending patch for it. The corresponding commit is
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=commit;h=5eeedd55b2d7e53b5fdcdab6a8e74bb666d75bcc

I welcome feedback about it. I've spent a fair bit of time immersed in
this stuff, and I am not really sure anymore what's understandable and
whatnot ;)

It's referencing pg_recvlogical which isn't committed yet (that's the
commit just before), that's why the docs weren't committed with the
feature itself.

 src/backend/replication/logical/logical.c, which textually contains most
 of the functions that appear to interact with the test_decoding module,
 contains this in the header comment:
 
 
 The idea is that a consumer provides three callbacks, one to read WAL,
 one to prepare a data write, and a final one for actually writing since
 their implementation depends on the type of consumer.  Check
 logicalfunc.c for an example implementations of a fairly simple consumer
 and a implementation of a WAL reading callback that's suitable for
 simpler consumers.
 
 
 There is no file logicalfunc.c.

Hrmpf: There's a missing 's', it's logicalfuncs.c.

 And test_decoding actually uses five callbacks, not three.

The callbacks the logicalfuncs.c header comment is talking about are for
adding a new method of outputting data. Currently you can stream out
changes via walsender and via the SQL SRFs. But it might be interesting to
e.g. consume the changes in a bgworker, without going through either SQL
or walsender. To do that you need the three callbacks referenced above.
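As a structural sketch of those three callbacks (in Python with illustrative field names; the real API is C and lives in logical.c/logicalfuncs.c), a consumer might be modelled like this:

```python
# Structural sketch, not the real C API: the three callbacks a consumer
# provides are one to read WAL, one to prepare a data write, and one to
# actually perform the write.  All names here are illustrative only.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ConsumerCallbacks:
    read_wal: Callable[[int], bytes]     # fetch a WAL record at a position
    prepare_write: Callable[[], None]    # e.g. start an output message
    do_write: Callable[[bytes], None]    # ship the decoded data out

out = []
cb = ConsumerCallbacks(
    read_wal=lambda lsn: b"record@%d" % lsn,
    prepare_write=lambda: out.append(b"HDR"),
    do_write=lambda data: out.append(data),
)

record = cb.read_wal(42)
cb.prepare_write()
cb.do_write(record)
print(out)  # [b'HDR', b'record@42']
```

The point of the split is that a walsender, an SQL SRF, and a hypothetical bgworker consumer can each plug in their own implementations of these three hooks while sharing the decoding machinery.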

 Is a consumer the same as a decoder?

A consumer is just the recipient of the changestream, i.e. the walsender
that streams out the changestream, or the SRF that spills the data into
a tuplestore.

I don't think the term decoder is used anywhere, but if it is, it'd
be the output plugin.

 test_decoding.c contains this:
 
 /* These must be available to pg_dlsym() */
 static void pg_decode_startup(LogicalDecodingContext *ctx,
 OutputPluginOptions *opt, bool is_init);
 ...
 
 which is surely wrong.

Hm, these days the comment should be above _PG_init() and
_PG_output_plugin_init(). That changed around several times...

Could you perhaps commit the attached patch fixing the issues you
mentioned?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/contrib/test_decoding/test_decoding.c b/contrib/test_decoding/test_decoding.c
index e356c7c..31aa012 100644
--- a/contrib/test_decoding/test_decoding.c
+++ b/contrib/test_decoding/test_decoding.c
@@ -33,6 +33,7 @@
 
 PG_MODULE_MAGIC;
 
+/* These must be available to pg_dlsym() */
 extern void		_PG_init(void);
 extern void		_PG_output_plugin_init(OutputPluginCallbacks *cb);
 
@@ -43,7 +44,6 @@ typedef struct
 	bool		include_timestamp;
 } TestDecodingData;
 
-/* These must be available to pg_dlsym() */
 static void pg_decode_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
 			  bool is_init);
 static void pg_decode_shutdown(LogicalDecodingContext *ctx);
diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index 13a22d4..04e407a 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -12,14 +12,17 @@
  *together provide logical decoding, primarily by providing so
  *called LogicalDecodingContexts. The goal is to encapsulate most of the
  *internal complexity for consumers of logical decoding, so they can
- *create and consume a changestream with a low amount of code.
+ *create and consume a changestream with a low amount of code. Builtin
+ *consumers are the walsender and SQL SRF interface, but it's possible to
+ *add further ones without changing core code, e.g. to consume changes in
+ *a bgworker.
  *
  *The idea is that a consumer provides three callbacks, one to read WAL,
  *one to prepare a data write, and a final one for actually writing since
  *their implementation depends on the type of consumer.  Check
- *logicalfunc.c for an example implementations of a fairly simple consumer
+ *logicalfuncs.c for an example implementation of a fairly simple consumer
  *and a implementation of a WAL reading callback that's suitable for
- *simpler consumers.
+ *simple consumers.
  *-
  */
 



Re: [HACKERS] proposal (9.5) : psql unicode border line styles

2014-03-11 Thread Pavel Stehule
Hello

I had to reduce the allowed line styles to single or double, because the
Unicode box-drawing characters only provide the combinations
single+double and single+thick.

postgres=# \l
  List of databases
   Name|  Owner   | Encoding |   Collate   |Ctype|   Access
privileges
---+--+--+-+-+---
 postgres  | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
 template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
=c/postgres  +
   |  |  | | |
postgres=CTc/postgres
 template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
=c/postgres  +
   |  |  | | |
postgres=CTc/postgres
(3 rows)

postgres=# \pset border 2
Border style (border) is 2.
postgres=# \pset linestyle unicode
Line style (linestyle) is unicode.
postgres=# \l
   List of databases
┌───┬──┬──┬─┬─┬───┐
│   Name│  Owner   │ Encoding │   Collate   │Ctype│   Access
privileges   │
├───┼──┼──┼─┼─┼───┤
│ postgres  │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8
│   │
│ template0 │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8 │
=c/postgres  ↵│
│   │  │  │ │ │
postgres=CTc/postgres │
│ template1 │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8 │
=c/postgres  ↵│
│   │  │  │ │ │
postgres=CTc/postgres │
└───┴──┴──┴─┴─┴───┘
(3 rows)

postgres=# \pset unicode_header_linestyle double
Unicode border linestyle is double.
postgres=# \l
   List of databases
┌───┬──┬──┬─┬─┬───┐
│   Name│  Owner   │ Encoding │   Collate   │Ctype│   Access
privileges   │
╞═══╪══╪══╪═╪═╪═══╡
│ postgres  │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8
│   │
│ template0 │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8 │
=c/postgres  ↵│
│   │  │  │ │ │
postgres=CTc/postgres │
│ template1 │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8 │
=c/postgres  ↵│
│   │  │  │ │ │
postgres=CTc/postgres │
└───┴──┴──┴─┴─┴───┘
(3 rows)

postgres=#
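As a rough model of the proposed settings, each line element maps to one of the allowed styles, and a style selects a set of Unicode box-drawing characters. The character tables below are illustrative, not psql's actual ones:

```python
# Rough model: each line element gets one of the allowed styles, and a
# style selects the Unicode box-drawing characters used to draw it.
# These tables are illustrative, not psql's actual ones.
HORIZ = {"single": "─", "double": "═", "thick": "━"}
TEE = {"single": "┬", "double": "╦", "thick": "┳"}
CORNERS = {"single": ("┌", "┐"), "double": ("╔", "╗"), "thick": ("┏", "┓")}

def top_border(widths, style="double"):
    """Draw the top border line for columns of the given widths."""
    left, right = CORNERS[style]
    return left + TEE[style].join(HORIZ[style] * w for w in widths) + right

print(top_border([3, 12], style="double"))
print(top_border([3, 12], style="single"))
```

The header separator, row separator, and column separator would each get the same treatment with their own style lookup, which is the per-element configurability the \pset proposal describes.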

Regards

Pavel



2014-03-07 19:24 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com:

 Hello

 I am returning back to this topic. Last time I proposed styles:


 http://www.postgresql.org/message-id/cafj8prclgoktryjpbtoncpgyftrcz-zgfowdc1jqulb+ede...@mail.gmail.com

 http://postgres.cz/wiki/Pretty_borders_in_psql

 This experiment fails, but there are some interesting tips in discuss.

 So I propose a slightly different proposal: choose one predefined style
 for each table line element. These styles are active only when linestyle
 is unicode.

 So possible line elements are:

 * border,
 * header_separator,
 * row_separator,
 * column_separator,

 Possible styles (for each element)

 * none,
  * single,
 * double,
 * thick,

 It should have enough variability to define all the styles proposed earlier.
 I hope this proposal is safe and simple to use. Styles should be
 saved persistently in the .psqlrc file - and some examples can be in the
 documentation.

 Usage:

 \pset linestyle_border double
 \pset linestyle_header_separator single
 \pset linestyle_row_separator single
 \pset linestyle_column_separator single

 \pset linestyle unicode

 ╔═══╤╤═══╗
 ║ a │ b  │   c   ║
 ╟───┼┼───╢
 ║ 1 │ 2012-05-24 │ Hello ║
 ╟───┼┼───╢
 ║ 2 │ 2012-05-25 │ Hello ║
 ║   ││ World ║
 ╚═══╧╧═══╝
 (2 rows)


 Comments, ideas ?

 Regards

 Pavel








From ea7a046afbd6f517b37606f3e52bd2e401d1fad2 Mon Sep 17 00:00:00 2001
From: Pavel Stehule pavel.steh...@gooddata.com
Date: Tue, 11 Mar 2014 20:57:08 +0100
Subject: [PATCH] initial

---
 doc/src/sgml/ref/psql-ref.sgml |  36 
 src/bin/psql/command.c | 118 +++-
 src/bin/psql/print.c   | 199 -
 src/bin/psql/print.h   |  22 -
 src/bin/psql/startup.c |   5 ++
 src/bin/psql/tab-complete.c|  13 ++-
 6 files changed, 345 insertions(+), 48 deletions(-)

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 8813be8..3e7e748 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2273,6 +2273,42 @@ lo_import 152801
   /para
   /listitem
   

Re: [HACKERS] The case against multixact GUCs

2014-03-11 Thread Alvaro Herrera
Sigh ...

Josh Berkus wrote:

 Further, there's no clear justification why these cannot be set to be
 the same as our other freeze ages (which our users also don't
 understand), or a constant calculated portion of them, or just a
 constant.

Calculated portion was my first proposal.  The objection that was raised
was that there's no actual correlation between Xid consumption rate and
multixact consumption rate.  That observation is correct; in some use
cases multixacts will be consumed faster, in others they will be
consumed slower.  So there's no way to have multixact cleanup not cause
extra autovacuum load if we don't have the parameters.

 Since nobody anticipated someone adding a GUC in a minor
 release, there was no discussion of this topic that I can find; the new
 GUCs were added as a side effect of fixing the multixact vacuum issue.
  Certainly I would have raised a red flag if the discussion of the new
 GUCs hadn't been buried deep inside really long emails.

When problems are tough, explanations get long.  There's no way around
that.  I cannot go highlighting text in red hoping that people will read
those parts.

 Adding new GUCs which nobody has any idea how to set, or can even
 explain to new users, is not a service to our users.  These should be
 removed.

I don't think we're going to remove the parameters.  My interpretation
of the paragraph above is "can we please have some documentation that
explains how to set these parameters".  To that, the answer is sure, we
can.  However, I don't have time to write it at this point.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] The case against multixact GUCs

2014-03-11 Thread David Johnston
Josh Berkus wrote
 Hackers,
 
 In the 9.3.3 updates, we added three new GUCs to control multixact
 freezing.  This was an unprecedented move in my memory -- I can't recall
 ever adding a GUC to a minor release which wasn't backwards
 compatibility for a security fix.  This was a mistake.

It probably should have been handled better but the decision to make these
parameters visible itself doesn't seem to be the wrong decision - especially
when limited to a fairly recently released back-branch.


 What makes these GUCs worse is that nobody knows how to set them; nobody
 on this list and nobody in the field.  Heck, I doubt 1 in 1000 of our
 users (or 1 in 10 people on this list) know what a multixact *is*.

That isn't a reason in itself to not have the capability if it is actually
needed.


 Further, there's no clear justification why these cannot be set to be
 the same as our other freeze ages (which our users also don't
 understand), or a constant calculated portion of them, or just a
 constant.  Since nobody anticipated someone adding a GUC in a minor
 release, there was no discussion of this topic that I can find; the new
 GUCs were added as a side effect of fixing the multixact vacuum issue.
  Certainly I would have raised a red flag if the discussion of the new
 GUCs hadn't been buried deep inside really long emails.

The release documentation makes a pointed claim that the situation WAS that
the two were identical; but the different consumption rates dictated making
the multi-xact configuration independently configurable.  So in effect the
GUC was always present - just not user-visible.

Even if there are not any current best practices surrounding this topic, at
least this way there is an actual place to put the derived value as methods
are developed.  As a starting point one can simply look at the defaults and,
if they have changed the value of the non-multixact setting, apply the same
factor to the corresponding multixact one.

Now, obviously someone has to think to actually do that - and the release
notes probably should have provided such guidance - but as I state
explicitly below the issue is more about insufficient communication and
education and less about providing the flexibility.


 Adding new GUCs which nobody has any idea how to set, or can even
 explain to new users, is not a service to our users.  These should be
 removed.

Or we should insist that those few that do have an understanding create some
kind of wiki document, or even a documentation section, to educate those
that are not as knowledgeable in this area.

For good reason much of the recent focus in this area has been on actually
getting the feature to work.  Presuming that it is a desirable feature -
which it hopefully is, given that it made it into the wild - such focus has
obviously been necessary given the apparent complexity of the feature (as
evidenced by the recent serious bug reports), but hopefully the feature
itself is now mostly locked down and education can begin.

David J.








Re: [HACKERS] GIN improvements part2: fast scan

2014-03-11 Thread Tomas Vondra
Hi all,

a quick question that just occurred to me - do you plan to tweak the cost
estimation for GIN indexes, in this patch?

IMHO it would be appropriate, given the improvements and gains, but it
seems to me gincostestimate() was not touched by this patch.

I just ran into this while testing some jsonb stuff, and after creating
a GIN and GIST indexes on the same column, I get these two plans:

===

db=# explain analyze select count(*) from messages_2 where headers ?
'x-virus-scanned';

  QUERY PLAN

 Aggregate  (cost=1068.19..1068.20 rows=1 width=0) (actual
time=400.149..400.150 rows=1 loops=1)
   ->  Bitmap Heap Scan on messages_2  (cost=10.44..1067.50 rows=278
width=0) (actual time=27.974..395.840 rows=70499 loops=1)
 Recheck Cond: (headers ? 'x-virus-scanned'::text)
 Rows Removed by Index Recheck: 33596
 Heap Blocks: exact=40978
         ->  Bitmap Index Scan on messages_2_gist_idx  (cost=0.00..10.37
rows=278 width=0) (actual time=21.762..21.762 rows=104095 loops=1)
   Index Cond: (headers ? 'x-virus-scanned'::text)
 Planning time: 0.052 ms
 Total runtime: 400.179 ms
(9 rows)

Time: 400,467 ms

db=# drop index messages_2_gist_idx;
DROP INDEX

db=# explain analyze select count(*) from messages_2 where headers ?
'x-virus-scanned';
  QUERY PLAN

 Aggregate  (cost=1083.91..1083.92 rows=1 width=0) (actual
time=39.130..39.130 rows=1 loops=1)
   ->  Bitmap Heap Scan on messages_2  (cost=26.16..1083.22 rows=278
width=0) (actual time=11.285..36.248 rows=70499 loops=1)
 Recheck Cond: (headers ? 'x-virus-scanned'::text)
 Heap Blocks: exact=23896
         ->  Bitmap Index Scan on messages_2_gin_idx  (cost=0.00..26.09
rows=278 width=0) (actual time=7.974..7.974 rows=70499 loops=1)
   Index Cond: (headers ? 'x-virus-scanned'::text)
 Planning time: 0.064 ms
 Total runtime: 39.160 ms
(8 rows)

Time: 39,509 ms

===

So while the GIN plan seems to be just slightly more expensive than the
GiST one, it's actually way faster.

Granted, most won't have GIN and GIST index on the same column at the
same time, but bad cost estimate may cause other issues. Maybe I could
achieve this by tweaking the various cost GUCs, but ISTM that tweaking
the cost estimation would be appropriate.

regards
Tomas




Re: [HACKERS] jsonb and nested hstore

2014-03-11 Thread Tomas Vondra
Hi,

I've spent a few hours stress-testing this a bit - loading a mail
archive with ~1M of messages (with headers stored in a jsonb column) and
then doing queries on that. Good news - no crashes or any such issues so
far. The queries that I ran manually seem to return sane results.

The only problem I ran into is with limited index row size with GIN
indexes. I understand it's not a bug, but I admit I haven't realized I
might run into it in this case ...

The data I used for testing is just a bunch of e-mail messages, with
headers stored as jsonb, so each row has something like this in
headers column:

{
 from : John Doe j...@example.com,
 to : [Jane Doe j...@example.com, Jack Doe j...@example.com],
 cc : ...,
 bcc : ...,
 ... various other headers ...
}

The snag is that some of the header values may be very long, exceeding
the limit of 1352 bytes and causing errors like this:

  ERROR:  index row size 1416 exceeds maximum 1352 for index gin_idx

A good example of such header is dkim-signature which basically
contains the whole message digitally signed with DKIM. The signature
tends to be long and non-compressible, thanks to the signature.

I'm wondering what's the best way around this, because I suspect many
new users (especially those attracted by jsonb and GIN improvements)
will run into this. Maybe not immediately, but eventually they'll try to
insert a jsonb with a long value, and it will fail ...

With btree indexes on text I would probably create an index on
substr(column,0,1000) or something like that, but doing that with JSON
seems a bit strange.
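For plain text that workaround would look something like this (sketch only -
the "subject" key and the index name are made up, not from my test data):

```sql
-- Hypothetical btree workaround: index only a prefix of one extracted
-- value, so no single index row can exceed the size limit.
CREATE INDEX messages_2_subject_idx
    ON messages_2 (substr(headers ->> 'subject', 0, 1000));
```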

I assume we need to store the actual values in the GIN index (so a hash
is not sufficient), right?

GIST indexes work, but with that I have to give up the significant
performance gains that we got thanks to Alexander's GIN patches.

regards
Tomas




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 On 03/11/2014 06:57 AM, Tom Lane wrote:
 Mind you, I wouldn't be unhappy to see it go away; it's a kluge and always
 has been.  I'm just expecting lots of push-back if we try.  And it's kind
 of hard to resist push-back when you don't have a substitute to offer.

 Yeah, what we really need is encapsulated per-DB users and local
 superusers.  I think every agrees that this is the goal, but nobody
 wants to put in the work to implement a generalized solution.

Well ... you know, how hard would it really be?  Extend the primary
key of pg_authid to be (username, database_oid) with the convention
of database_oid = 0 for a globally known user.  Add some syntax to
CREATE USER to indicate whether you're creating a global user
(the default) or one known only within the current database.  I'm
not quite sure what to do with local users for pg_dump/pg_dumpall;
perhaps pg_dump should get the responsibility of creating local users?
But otherwise it seems like there are no issues that we'd not have to
solve anyway if we wanted to make db_user_name less of a crock.
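Purely as illustration, the syntax might look something like this (invented
grammar - nothing like it exists today):

```sql
-- Hypothetical syntax sketch for the two kinds of users:
CREATE USER admin PASSWORD 'xyzzy';                -- global: (admin, 0)
CREATE USER joe IN DATABASE db1 PASSWORD 'xyzzy';  -- local:  (joe, db1's oid)
```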

In particular, I'd like to see an exclusion that prevents local users
from having the same name as any global user, so that we don't have
ambiguity in GRANT and similar commands.  This doesn't seem simple to
enforce (if we supported partial indexes on system catalogs, it would
be ...) but surely this representation is more amenable to enforcing it
than the existing one.

regards, tom lane




Re: [HACKERS] jsonb and nested hstore

2014-03-11 Thread Peter Geoghegan
On Tue, Mar 11, 2014 at 3:58 PM, Tomas Vondra t...@fuzzy.cz wrote:
   ERROR:  index row size 1416 exceeds maximum 1352 for index gin_idx

All index AMs have similar restrictions.

 A good example of such header is dkim-signature which basically
 contains the whole message digitally signed with DKIM. The signature
 tends to be long and non-compressible, thanks to the signature.

 I'm wondering what's the best way around this, because I suspect many
 new users (especially those attracted by jsonb and GIN improvements)
 will run into this. Maybe not immediately, but eventually they'll try to
 insert a jsonb with a long value, and it will fail ...

The jsonb_hash_ops operator class just stores a 32-bit integer hash
value (it always sets the recheck flag, which only some of the other
default GIN opclass' strategies do). It only supports containment, and
not the full variety of operators that the default opclass supports,
which is why it isn't the default. I think that in practice the
general recommendation will be that when indexing at the top level,
use jsonb_hash_ops. When indexing nested items, use the more flexible
default GIN opclass. That seems like a pretty smart trade-off to me.
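In DDL terms the trade-off would look roughly like this (a sketch; table
and index names borrowed from Tomas' test):

```sql
-- Default opclass: supports ?, ?|, ?& and @>, but stores the keys and
-- values themselves, so a long value can exceed the index row size limit.
CREATE INDEX messages_2_gin_idx ON messages_2 USING gin (headers);

-- jsonb_hash_ops: stores only a 32-bit hash per item, so long values are
-- fine, but only containment is supported and the heap is always rechecked.
CREATE INDEX messages_2_gin_hash_idx
    ON messages_2 USING gin (headers jsonb_hash_ops);
```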

The more I think about it, the more inclined I am to lose GiST support
entirely for the time being. It lets us throw out about 700 lines of C
code, which is a very significant fraction of the total, removes the
one open bug, and removes the least understood part of the code. The
GiST opclass is not particularly compelling for this.

-- 
Peter Geoghegan




Re: [HACKERS] [PATCH] Store Extension Options

2014-03-11 Thread Simon Riggs
On 11 March 2014 18:33, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 -1 to *requiring* validation for table-level options for exactly the
 same reasons we no longer validate custom GUCs.

 Well, that is an interesting analogy, but I'm not sure how much it applies
 here.  In the case of a GUC, you can fairly easily validate it once the
 module does get loaded (and before the module actually tries to do
 anything with it).  I don't see how that's going to work for table
 options.  I trust nobody is seriously proposing that on module load, we're
 going to scan the whole of pg_class looking to see if there are incorrect
 settings.  (Even if we did, what would we do about it?  Not try to force a
 pg_class update, for sure.  And what if the module is loading into the
 postmaster thanks to a preload spec?)

Thank goodness for that. Strict validation does seem scary.

 I don't really think partial validation makes sense.  We could just remove
 the whole topic, and tell extension authors that it's up to them to defend
 themselves against bizarre values stored for their table options.  But I'm
 wondering if there's really so much use-case for a feature like that.

DBAs are fairly used to the idea that if you put crap data in the
database then bad things happen. We provide the table, they provide
the data. Validation is possible, but not enforced as essential.
(Except in terms of the datatype - but then we are also validating
data to specific types here).

So I think that DBAs will also cope rather well with table-level
options without us nannying them.

There is nothing more annoying than needing to run scripts in a
specific sequence to make them work, or dumps that fail because
certain modules aren't loaded yet (or cannot ever be so). And maybe
the DBA wants to annotate tables based on a design and then later move
to implement modules to take advantage of the annotation.

Having an option be set and yet be unvalidated and/or unused is no
more annoying than having a column in a table that is known incorrect
and/or not accessed. Searching for badly set options needs to be
possible, even easy, but hard validation can cause problems. And if we
try and force it, what's to stop people from using a dummy validator
just to circumvent the strictness?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 The docs say:

 db_user_namespace causes the client's and server's user name
 representation to differ. Authentication checks are always done with
 the server's user name so authentication methods must be configured
 for the server's user name, not the client's. Because md5 uses the
 user name as salt on both the client and server, md5 cannot be used
 with db_user_namespace.

 Is that the only major issue?

That's one major issue.

Another one is the hacky way of distinguishing global from per-db users
(ie, user must append @ to log in as a global user).  I'd love to get rid
of that requirement, but not sure how.  Would it be all right for the
server to first probe for user@db and then for just user, or vice versa?
How would this integrate with pg_hba.conf?

 Why not have the server strip out the @db part if this is on?

Hmm ... that's a thought.  IIRC, the client doesn't actually know that
this is going on, it just sends the user name, and hashes against that
too.  If the server remembered the bare user name it could hash against
that as well.

At least, that would work for db-local usernames.  It *doesn't* work for
global names, because the client side has no idea that the @ ought to not
be counted for hashing.  Again, if we could get rid of that convention,
it'd be far less messy.
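The salt mismatch is easy to see with the built-in md5() function (assume
the password is 'secret'):

```sql
-- What gets stored is 'md5' || md5(password || username).  With
-- db_user_namespace on, the server-side name is 'joe@db1' while the
-- client hashes with plain 'joe', so the two can never agree:
SELECT 'md5' || md5('secret' || 'joe')     AS client_side,
       'md5' || md5('secret' || 'joe@db1') AS server_side;
```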

A possible objection would be that changing the definition of what to hash
would invalidate existing MD5 password hashes; but since MD5 passwords
have never been allowed with db_user_namespace names anyway, that doesn't
seem very forceful.

 If we made this an initdb-time setting rather than a 
 GUC then we'd remove the problems caused by turning this on and off.

Why do we need to restrict that?

regards, tom lane




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Andrew Dunstan


On 03/11/2014 07:41 PM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

The docs say:
 db_user_namespace causes the client's and server's user name
 representation to differ. Authentication checks are always done with
 the server's user name so authentication methods must be configured
 for the server's user name, not the client's. Because md5 uses the
 user name as salt on both the client and server, md5 cannot be used
 with db_user_namespace.
Is that the only major issue?

That's one major issue.

Another one is the hacky way of distinguishing global from per-db users
(ie, user must append @ to log in as a global user).  I'd love to get rid
of that requirement, but not sure how.  Would it be all right for the
server to first probe for user@db and then for just user, or vice versa?


user@db first, I think. I guess that means don't name your global users 
the same as db-specific users. Maybe we should prevent a conflict? Or if 
we allow a conflict then only require user@ in conflicting cases.




How would this integrate with pg_hba.conf?


Not sure.




Why not have the server strip out the @db part if this is on?

Hmm ... that's a thought.  IIRC, the client doesn't actually know that
this is going on, it just sends the user name, and hashes against that
too.  If the server remembered the bare user name it could hash against
that as well.

At least, that would work for db-local usernames.  It *doesn't* work for
global names, because the client side has no idea that the @ ought to not
be counted for hashing.  Again, if we could get rid of that convention,
it'd be far less messy.



Right.



A possible objection would be that changing the definition of what to hash
would invalidate existing MD5 password hashes; but since MD5 passwords
have never been allowed with db_user_namespace names anyway, that doesn't
seem very forceful.


Agreed.




If we made this an initdb-time setting rather than a
GUC then we'd remove the problems caused by turning this on and off.

Why do we need to restrict that?





Oh, probably a thinko on my part. I thought the setting would affect the
stored password, but now that I think about it some more I see it doesn't.



I think we might be on the track of something useful here.


cheers

andrew




Re: [HACKERS] jsonb and nested hstore

2014-03-11 Thread Peter Geoghegan
On Tue, Mar 11, 2014 at 4:41 PM, Peter Geoghegan p...@heroku.com wrote:
 I think that in practice the
 general recommendation will be that when indexing at the top level,
 use jsonb_hash_ops. When indexing nested items, use the more flexible
 default GIN opclass. That seems like a pretty smart trade-off to me.

By which I mean: index nested items using an expressional GIN index.
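For instance (a sketch, using the "to" key from Tomas' test data earlier in
the thread):

```sql
-- Expression index on one nested item rather than the whole column:
CREATE INDEX messages_2_to_gin_idx
    ON messages_2 USING gin ((headers -> 'to'));
```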


-- 
Peter Geoghegan




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Josh Berkus
On 03/11/2014 06:57 AM, Tom Lane wrote:
 Mind you, I wouldn't be unhappy to see it go away; it's a kluge and always
 has been.  I'm just expecting lots of push-back if we try.  And it's kind
 of hard to resist push-back when you don't have a substitute to offer.

Yeah, what we really need is encapsulated per-DB users and local
superusers.  I think every agrees that this is the goal, but nobody
wants to put in the work to implement a generalized solution.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Andrew Dunstan


On 03/11/2014 09:37 PM, Tom Lane wrote:



In particular, I'd like to see an exclusion that prevents local users
from having the same name as any global user, so that we don't have
ambiguity in GRANT and similar commands.  This doesn't seem simple to
enforce (if we supported partial indexes on system catalogs, it would
be ...) but surely this representation is more amenable to enforcing it
than the existing one.




Should be workable if you're creating a local name - just check against 
the list of global roles. For creating global names, maybe we could keep 
an invisible shared catalog (or maybe only visible to global superusers) 
of local names / db oids. New global names would be checked against that 
catalog. Something like that.


cheers

andrew




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 03/11/2014 09:37 PM, Tom Lane wrote:
 In particular, I'd like to see an exclusion that prevents local users
 from having the same name as any global user, so that we don't have
 ambiguity in GRANT and similar commands.  This doesn't seem simple to
 enforce (if we supported partial indexes on system catalogs, it would
 be ...) but surely this representation is more amenable to enforcing it
 than the existing one.

 Should be workable if you're creating a local name - just check against 
 the list of global roles.

Concurrent creations won't be safe without some sort of locking scheme.
A unique index would be a lot better way of plugging that hole than a
system-wide lock on user creation.  But not sure how to define a unique
index that allows (joe, db1) to coexist with (joe, db2) but not with
(joe, 0).
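The difficulty can be sketched in plain SQL (a hypothetical table standing
in for pg_authid):

```sql
-- A two-column unique index gives per-database uniqueness ...
CREATE TABLE authid_sketch (rolname name NOT NULL, roldb oid NOT NULL);
CREATE UNIQUE INDEX ON authid_sketch (rolname, roldb);
-- ... so (joe, db1) and (joe, db2) can coexist.  But nothing rejects
-- (joe, 0) alongside them: the conflicting rows differ in roldb, so the
-- unique index never compares them.
```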

regards, tom lane




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Aidan Van Dyk
So I'll admit to using it, only in toy setups...

I use it with trust and ident, on local connections though, not password

I try to keep my laptops clean of mysqld, and I use PG.  And only on my
laptop/PC, I make a database for every user...  And every app gets a
userid, and a schema.  Every user gets passwordless access to their
database.  And the userid associates the app, and the defaults that get
used on their connections.

So, I think it's neat, but wouldn't be put out if it was removed ...


 On Tue, Mar 11, 2014 at 9:47 AM, Magnus Hagander mag...@hagander.net wrote:

 On Tue, Mar 11, 2014 at 2:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Magnus Hagander mag...@hagander.net writes:
  On Sun, Mar 9, 2014 at 9:00 PM, Thom Brown t...@linux.com wrote:
  It will be 12 years this year since this temporary measure was
  added.  I'm just wondering, is there any complete solution that
  anyone had in mind yet?  Or should this just be deprecated?

  I'd say +1 to remove it. That would also make it possible to get rid of
  password authentication...

 If we remove it without providing a substitute feature, people who are
 using it will rightly complain.

 Are you claiming there are no users, and if so, on what evidence?


 I am claiming that I don't think anybody is using that, yes.

 Based on the fact that I have *never* come across it on any system I've
 come across since, well, forever. Except once I think, many years ago, when
 someone had enabled it by mistake and needed my help to remove it...

 But we should absolutely deprecate it first in that case. Preferably
 visibly (e.g. with a log message when people use it). That could at least
 get the people who use it to let us know they do, so we can figure out
 whether anyone does - and de-deprecate it if so.

 Or if someone wants to fix it properly of course :)

 --
  Magnus Hagander
  Me: http://www.hagander.net/
  Work: http://www.redpill-linpro.com/




-- 
Aidan Van Dyk                        Create like a god,
ai...@highrise.ca                    command like a king,
http://www.highrise.ca/              work like a slave.


Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Andrew Dunstan


On 03/11/2014 11:06 PM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

On 03/11/2014 09:37 PM, Tom Lane wrote:

In particular, I'd like to see an exclusion that prevents local users
from having the same name as any global user, so that we don't have
ambiguity in GRANT and similar commands.  This doesn't seem simple to
enforce (if we supported partial indexes on system catalogs, it would
be ...) but surely this representation is more amenable to enforcing it
than the existing one.

Should be workable if you're creating a local name - just check against
the list of global roles.

Concurrent creations won't be safe without some sort of locking scheme.
A unique index would be a lot better way of plugging that hole than a
system-wide lock on user creation.  But not sure how to define a unique
index that allows (joe, db1) to coexist with (joe, db2) but not with
(joe, 0).





Create (joe, db1), (joe, db2) ... for each global user? Might get a tad 
ugly if you have lots of global users or lots of databases.


Just trying to be a bit creative :-)

cheers

andrew




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Jaime Casanova
On Tue, Mar 11, 2014 at 10:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 But not sure how to define a unique
 index that allows (joe, db1) to coexist with (joe, db2) but not with
 (joe, 0).


And why do you want that restriction? When you log in you need to specify
the db, right? If you don't specify the db then it's the global 'joe'
that wants to log in.

-- 
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Andrew Dunstan


On 03/11/2014 11:50 PM, Jaime Casanova wrote:

On Tue, Mar 11, 2014 at 10:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:

But not sure how to define a unique
index that allows (joe, db1) to coexist with (joe, db2) but not with
(joe, 0).


and why you want that restriction? when you login you need to specify
the db, right? if you don't specify the db then is the global 'joe'
that want to login



You can't login without specifying a db. Every session is connected to a db.

cheers

andrew




Re: [HACKERS] COPY table FROM STDIN doesn't show count tag

2014-03-11 Thread Rajeev rastogi
On 11 March 2014 19:52, Tom Lane wrote:

 After sleeping on it, I'm inclined to think we should continue to not
 print status for COPY TO STDOUT.  Aside from the risk of breaking
 scripts, there's a decent analogy to be made to SELECT: we don't print
 a status tag for that either.

It is correct that SELECT does not print a conventional status tag, but it
still prints the number of rows selected (e.g. (2 rows)) along with the
actual row values, which can very well be considered a kind of status.
From this result the user can make out how many rows have been selected.

But in case of COPY TO STDOUT, if we don't print anything, then the user
has no direct way of finding out how many rows were copied from the table
to STDOUT, which might have been very useful.

Please let me know your opinion or if I have missed something.

Thanks and Regards,
Kumar Rajeev Rastogi







Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Jaime Casanova
On Tue, Mar 11, 2014 at 10:52 PM, Andrew Dunstan and...@dunslane.net wrote:

 On 03/11/2014 11:50 PM, Jaime Casanova wrote:

 On Tue, Mar 11, 2014 at 10:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 But not sure how to define a unique
 index that allows (joe, db1) to coexist with (joe, db2) but not with
 (joe, 0).

 and why you want that restriction? when you login you need to specify
 the db, right? if you don't specify the db then is the global 'joe'
 that want to login


 You can't login without specifying a db. Every session is connected to a db.


then, what's the problem with global users?

-- 
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157




Re: contrib/cache_scan (Re: [HACKERS] What's needed for cache-only table scan?)

2014-03-11 Thread Haribabu Kommi
On Thu, Mar 6, 2014 at 10:15 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 2014-03-06 18:17 GMT+09:00 Haribabu Kommi kommi.harib...@gmail.com:
 I will update you later regarding the performance test results.


I ran the performance test on the cache scan patch and below are the readings.

Configuration:

Shared_buffers - 512MB
cache_scan.num_blocks - 600
checkpoint_segments - 255

Machine:
OS - centos - 6.4
CPU - 4 core 2.5 GHZ
Memory - 4GB

                 Head      Patched    Diff
Select - 500K    772ms     2659ms     -200%
Insert - 400K    3429ms    1948ms     43%  (I am not sure how it improved in this case)
delete - 200K    2066ms    3978ms     -92%
update - 200K    3915ms    5899ms     -50%

This patch shows very well how the custom scan interface can be used, but
the patch itself has some performance problems which need to be
investigated.

I attached the test script file used for the performance test.

Regards,
Hari Babu
Fujitsu Australia
shared_buffers = 512MB
shared_preload_libraries = 'cache_scan'
cache_scan.num_blocks = 600
checkpoint_segments = 255

\timing

--cache scan select 5 million
create table test(f1 int, f2 char(70), f3 float, f4 char(100));

truncate table test;
insert into test values (generate_series(1,500), 'fujitsu', 1.1, 'Australia 
software tech pvt ltd');
checkpoint;

CREATE TRIGGER test_cache_row_sync AFTER INSERT OR UPDATE OR DELETE ON test FOR 
ROW EXECUTE PROCEDURE cache_scan_synchronizer();
CREATE TRIGGER test_cache_stmt_sync AFTER TRUNCATE ON test FOR STATEMENT 
EXECUTE PROCEDURE cache_scan_synchronizer();

vacuum analyze test;
explain select count(*) from test where f1 > 0;
select count(*) from test where f1 > 0;
select count(*) from test where f1 > 0;

delete from test where f1 > 100 and f1 <= 120;

update test set f1 = f1 + 300 where f1 > 120 and f1 <= 140;

insert into test values (generate_series(101, 140), 'fujitsu', 1.1, 
'Australia software tech pvt ltd');

drop table test cascade;

--sequence scan select 5 million
create table test(f1 int, f2 char(70), f3 float, f4 char(100));

truncate table test;
insert into test values (generate_series(1,500), 'fujitsu', 1.1, 'Australia 
software tech pvt ltd');
checkpoint;

vacuum analyze test;
set cache_scan.enabled = off;

explain select count(*) from test where f1 > 0;
select count(*) from test where f1 > 0;

delete from test where f1 > 100 and f1 <= 120;

update test set f1 = f1 + 300 where f1 > 120 and f1 <= 140;

insert into test values (generate_series(101, 140), 'fujitsu', 1.1, 
'Australia software tech pvt ltd');

drop table test cascade;


--cache scan select 500K
create table test(f1 int, f2 char(70), f3 float, f4 char(100));

truncate table test;
insert into test values (generate_series(1,500), 'fujitsu', 1.1, 'Australia 
software tech pvt ltd');
checkpoint;

CREATE TRIGGER test_cache_row_sync AFTER INSERT OR UPDATE OR DELETE ON test FOR 
ROW EXECUTE PROCEDURE cache_scan_synchronizer();
CREATE TRIGGER test_cache_stmt_sync AFTER TRUNCATE ON test FOR STATEMENT 
EXECUTE PROCEDURE cache_scan_synchronizer();

vacuum analyze test;
explain select count(*) from test where f1 % 10 = 5;
select count(*) from test where f1 % 10 = 5;
select count(*) from test where f1 % 10 = 5;

delete from test where f1 > 100 and f1 <= 120;

update test set f1 = f1 + 300 where f1 > 120 and f1 <= 140;

insert into test values (generate_series(101, 140), 'fujitsu', 1.1, 
'Australia software tech pvt ltd');

drop table test cascade;

--sequence scan select 500K
create table test(f1 int, f2 char(70), f3 float, f4 char(100));

truncate table test;
insert into test values (generate_series(1,500), 'fujitsu', 1.1, 'Australia 
software tech pvt ltd');
checkpoint;

vacuum analyze test;
set cache_scan.enabled = off;

explain select count(*) from test where f1 % 10 = 5;
select count(*) from test where f1 % 10 = 5;

delete from test where f1 > 100 and f1 <= 120;

update test set f1 = f1 + 300 where f1 > 120 and f1 <= 140;

insert into test values (generate_series(101, 140), 'fujitsu', 1.1, 
'Australia software tech pvt ltd');

drop table test cascade;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread David Johnston
Andrew Dunstan wrote
 On 03/11/2014 11:50 PM, Jaime Casanova wrote:
 On Tue, Mar 11, 2014 at 10:06 PM, Tom Lane <tgl@.pa> wrote:
 But not sure how to define a unique
 index that allows (joe, db1) to coexist with (joe, db2) but not with
 (joe, 0).

 and why do you want that restriction? When you log in you need to specify
 the db, right? If you don't specify the db then it is the global 'joe'
 that wants to log in

 
 You can't log in without specifying a db. Every session is connected to a
 db.
 
 cheers
 
 andrew

Random Thoughts:

So if dave is already a user in db1, only that specific dave can be made a
global user; any other dave would be disallowed.

Would user/password be a better PK, even with the obvious issue that the
password part of the key can change? user/password to database is properly
a many-to-many relationship. Or see next for something simpler.

A simple implementation would have the global users copied into each
database as it is created. There would also be a link from each of the
database-specific users to the global master, so that a password change
issued against the global user propagates to all the database-specific
copies.
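A minimal sketch of that propagation idea. Everything here is invented for
illustration: global_users and local_users are hypothetical stand-ins, not
real catalogs, and this ignores the fact that real per-database user lists
would live in separate databases, so cross-database propagation would need
more machinery than a plain trigger:

```sql
-- hypothetical stand-ins for a shared and a per-database user catalog
CREATE TABLE global_users (
    usename text PRIMARY KEY,
    passwd  text
);

CREATE TABLE local_users (
    usename        text,
    dbname         text,
    passwd         text,
    is_global_copy boolean DEFAULT false,  -- copied over from a global user?
    PRIMARY KEY (usename, dbname)
);

-- push a global password change down to every database-specific copy
CREATE FUNCTION propagate_global_passwd() RETURNS trigger AS $$
BEGIN
    UPDATE local_users
       SET passwd = NEW.passwd
     WHERE usename = NEW.usename
       AND is_global_copy;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER global_passwd_sync
    AFTER UPDATE OF passwd ON global_users
    FOR EACH ROW EXECUTE PROCEDURE propagate_global_passwd();
```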

Construction of a new global user would cause an immediate copy-over, which
would fail if the simple name is already in use in any of the existing
databases. Promotion becomes a problem that would ideally have a solution,
but then you are back to needing to distinguish between dave-db1 and
dave-db2...
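For what it's worth, the unique-index puzzle quoted upthread, allowing
(joe, db1) to coexist with (joe, db2) but not with (joe, 0), can be
approximated with an exclusion constraint rather than a plain unique index.
A sketch against a toy table (not the real shared catalog), assuming
btree_gist is installed and taking dbid = 0 to mean "global":

```sql
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- toy stand-in for a shared users catalog
CREATE TABLE toy_users (
    usename text NOT NULL,
    dbid    int  NOT NULL   -- 0 means "global"
);

-- A global user occupies the whole dbid range; a local user occupies just
-- its own db. Overlap means conflict, so (joe,1) and (joe,2) coexist, a
-- duplicate (joe,1) is rejected, and (joe,0) collides with any local joe.
ALTER TABLE toy_users ADD CONSTRAINT toy_users_scope
    EXCLUDE USING gist (
        usename WITH =,
        (CASE WHEN dbid = 0 THEN int4range(NULL, NULL)
              ELSE int4range(dbid, dbid, '[]') END) WITH &&
    );

INSERT INTO toy_users VALUES ('joe', 1);  -- ok
INSERT INTO toy_users VALUES ('joe', 2);  -- ok
INSERT INTO toy_users VALUES ('joe', 0);  -- fails: overlaps both rows above
```

Whether the shared catalog machinery could support anything like a gist
exclusion constraint is another question entirely; this just shows the
uniqueness rule itself is expressible.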

It would be nice if all users could be global and there were some way to
give them permissions on databases.

Or go to the opposite extreme: all users are local, and the concept of
global would have to be added by those desiring it. Basically, move
user management at the cluster level into a cluster support application and
leave databases free-standing.

David J.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/db-user-namespace-a-temporary-measure-tp5795305p5795609.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

