date:20120612

Re: [HACKERS] 9.2 final

2012-06-12 Thread Simon Riggs

On 11 June 2012 18:02, Joshua Berkus j...@agliodbs.com wrote:

 Hmmm.  I was assuming September, given how late the beta came out, and that 
 nobody has previously talked seriously about a June release.  I'll also point 
 out that while there's a beta2 tarball, there was no announcement and no 
 packages for it.

 If we decide to do June, then PR will be minimal because I was assuming I had 
 another 7 weeks to prepare it.  Not that that should be the deciding factor 
 (it would be great to get out an early release and get it out of the way) but 
 it should be taken into consideration.


Not really sure why we're discussing it.

We've always had a long beta cycle and this seems to cut it very short
at both ends.

If we're going to have a fixed cutoff date for patches then having a
variable release date seems weird, unless it is to elongate it to fix
bugs.

Let me put it this way: is there a benefit to changing the plan?
Anyone desperate for the new features is already using them, and we're
relying upon that to throw up bugs during beta.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Skip checkpoint on promoting from streaming replication

2012-06-12 Thread Simon Riggs

On 12 June 2012 03:38, Kyotaro HORIGUCHI
horiguchi.kyot...@lab.ntt.co.jp wrote:
 Hello, sorry for vague understanding.

  I depend on this and suppose we can omit it if latest checkpoint
  has been taken so as to be able to do crash recovery thereafter.

 I don't see any reason to special case this. If a checkpoint has no
 work to do, then it will go very quickly. Why seek to speed it up even
 further?

 I want the standby to start to serve as soon as possible even by
 a few seconds on failover in a HA cluster.

Please implement a prototype and measure how many seconds we are discussing.


  This condition could be secured by my another patch for
  checkpoint_segments on standby.

 More frequent checkpoints are very unlikely to secure a condition that
 no checkpoint at all is required at failover.

 I understand that checkpoint at the end of recovery is
 indispensable to ensure the availability of crash recovery
 afterward. Putting aside the convention about TLI increment and
 shutdown checkpoint, shutdown checkpoints there seems for me to
 be omittable if (and not 'only if', I suppose) crash recovery is
 available at the time.

 Shutdown checkpoint itself seems dispansable to me, but I'm
 shamingly not convinced so taking the TLI convention into
 consideration.


 Making a change that has a negative effect for everybody, in the hope
 of sometimes improving performance for something that is already fast
 doesn't seem a good trade off to me.

 Hmm.. I suppose the negative effect you've pointed is possible
 slowing down of hot-standby by the extra checkpoints being
 discussed in another thread, is it correct? Could you accept this
 kind of modification if it could be turned off by, say, GUC?


This proposal is for a performance enhancement. We normally require
some proof that the enhancement is real and that it doesn't have a
poor effect on people not using it. Please make measurements.

It's easy to force more frequent checkpointsif you wish them, so
please compare the case of more frequent checkpoints.


 Regrettably, the line of thought explained here does not seem useful to me.

 As you know, I was working on avoiding shutdown checkpoints completely
 myself. You are welcome to work on the approach Fujii and I discussed.

 Sorry, I'm afraid that I've failed to find that discussion. Could
 you let me have a pointer to that? Of cource I'd be very happy if
 the checkpoints are completely avoided on the approach.

Discussion on a patch submitted to me to the Januray 2012 CommitFest
to reduce failover time.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_basebackup --xlog compatibility break

2012-06-12 Thread Thom Brown

On 11 June 2012 22:40, Magnus Hagander mag...@hagander.net wrote:
 On Mon, Jun 11, 2012 at 11:28 PM, Peter Eisentraut pete...@gmx.net wrote:
 On sön, 2012-06-10 at 13:43 +0200, Magnus Hagander wrote:
 On Mon, Jun 4, 2012 at 2:14 PM, Peter Eisentraut pete...@gmx.net wrote:
  On tis, 2012-05-29 at 22:31 +0200, Magnus Hagander wrote:
  Yeah, good arguments all around, i agree too :-) Next question is -
  suggestions for naming of said paramter?
 
  --xlog-method=something?  And/or -Xsomething, which would automatically
  enable -x?

 How's this?

 I wouldn't make -x and -X exclusive.  The way I understood this is, -x
 means include xlog, and -X says how to.

 I guess either way of looking at it has its merits.

 I guess it's basically two ways of doing the same thing. I'm not
 especially attached to either one of them, so if you think the ohter
 one is better, I won't object to changing it.

+1 for not telling the user off for being explicit by stating both options.

-- 
Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] xml_is_document and selective pg_re_throw

2012-06-12 Thread Nikhil Sontakke

Hi,

Consider:

SELECT xml 'foobar/foobarfoo/bar' IS DOCUMENT;

And I was looking at xml_is_document() source code. It calls xml_parse
which throws an error with code set to ERRCODE_INVALID_XML_DOCUMENT. The
catch block of xml_parse then rethrows.

Now xml_is_document does a selective rethrow only if the error is not
ERRCODE_INVALID_XML_DOCUMENT. I can understand that this function does this
to return true/false, but doesn't this behavior of not propagating the
error up all the way dangerous? InterruptHoldoffCount inconsistencies for
instance?

A better way would have been to modify xml_parse to take an additional
boolean argument to_rethrow and not to rethrow if that is false?
Thoughts?

Regards,
Nikhils

Re: [HACKERS] Skip checkpoint on promoting from streaming replication

2012-06-12 Thread Kyotaro HORIGUCHI

Hello, Thank you to head me the previous discussion. I'll
consider them from now.

  I want the standby to start to serve as soon as possible even by
  a few seconds on failover in a HA cluster.
 
 Please implement a prototype and measure how many seconds we
 are discussing.

I'm sorry to have omitted measurement data. (But it might be
shown in previous discussion.)

Our previous measurement of failover of PostgreSQL 9.1 +
Pacemaker on some workload showed that shutdown snapshot takes 8
seconds out of 42 seconds of total failover time (about 20%).

OS: RHEL6.1-64
DBMS  : PostgeSQL 9.1.1
HA: pacemaker-1.0.11-1.2.2 x64
Repl  : sync
Workload  : master : pgbench / scale factor = 100 (aprx. 1.5GB)
standby: none (warm-standby)

shared_buffers  = 2.5GB
wal_buffers = 4MB
checkpoint_segments = 300
checkpoint_timeout  = 15min
checpoint_completion_target = 0.7
archive_mode= on

WAL segment comsumption was about 310 segments / 15 mins under
the condition above.

 This proposal is for a performance enhancement. We normally require
 some proof that the enhancement is real and that it doesn't have a
 poor effect on people not using it. Please make measurements.

On the benchmark above, extra load by more frequent (but the same
to the its master) checkpoint is not a problem. On the other
hand, failover time is expected to be shortened to 34 seconds
from 42 seconds by omitting the shutdown checkpoint.
(But I have not measured that..)

 Discussion on a patch submitted to me to the Januray 2012 CommitFest
 to reduce failover time.

Thank you and I'm sorry for missing it. I've found that
discussions and read them from now.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

== My e-mail address has been changed since Apr. 1, 2012.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minimising windows installer password confusion

2012-06-12 Thread Kevin Grittner

Dave Page  wrote:
 
 Probably the most common issue we see from Windows users of
 PostgreSQL is confusion over the passwords the installer asks for
 during installation and upgrade.
 
Yeah, I think so.
 
 Attached are some screenshots of the current installation and
 upgrade steps in question, and the password incorrect message box.
 The suggested new text for the steps that I've come up with is
 below. I'd welcome suggestions or improvements to help reduce
 pgsql-bugs traffic!
 
Are they running the installation as a system administrator?  If so,
rather than throwing up an error message and telling them to go use
other tools to reset the password, is it possible for the
administrator account to force a password change?  If that is
possible, it seems like it would be a lot more friendly.  If not,
perhaps the old postgres user could be renamed, and a new one created
with the password?
 
Unless there are technical limitations of the Windows environment
which make it impossible to fix this from within the installer, it
just seems like we should fix the problem rather than putting up an
error dialog box.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minimising windows installer password confusion

2012-06-12 Thread Magnus Hagander

On Tue, Jun 12, 2012 at 2:26 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Dave Page  wrote:

 Probably the most common issue we see from Windows users of
 PostgreSQL is confusion over the passwords the installer asks for
 during installation and upgrade.

 Yeah, I think so.

 Attached are some screenshots of the current installation and
 upgrade steps in question, and the password incorrect message box.
 The suggested new text for the steps that I've come up with is
 below. I'd welcome suggestions or improvements to help reduce
 pgsql-bugs traffic!

 Are they running the installation as a system administrator?  If so,
 rather than throwing up an error message and telling them to go use
 other tools to reset the password, is it possible for the
 administrator account to force a password change?  If that is
 possible, it seems like it would be a lot more friendly.  If not,
 perhaps the old postgres user could be renamed, and a new one created
 with the password?

That might break another app running nuder that account. Such as a
different version of PostgreSQL...

But an option could be to create a different account to run it under,
I guess... Leaving the old one where it is. I think that's better than
renaming the old one, really.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minimising windows installer password confusion

2012-06-12 Thread Kevin Grittner

Magnus Hagander  wrote:
 Kevin Grittner  wrote:
 
 Are they running the installation as a system administrator? If
 so, rather than throwing up an error message and telling them to
 go use other tools to reset the password, is it possible for the
 administrator account to force a password change? If that is
 possible, it seems like it would be a lot more friendly. If not,
 perhaps the old postgres user could be renamed, and a new one
 created with the password?
 
 That might break another app running nuder that account. Such as a
 different version of PostgreSQL...
 
 But an option could be to create a different account to run it
 under, I guess... Leaving the old one where it is. I think that's
 better than renaming the old one, really.
 
That makes sense.  I just think we should try very hard to make the
installer just work to the extent possible, rather than trying to
direct the user in how to use system tools in the middle of the
process.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.2 final

2012-06-12 Thread David Fetter

On Tue, Jun 12, 2012 at 09:38:54AM +0100, Simon Riggs wrote:
 On 11 June 2012 18:02, Joshua Berkus j...@agliodbs.com wrote:
 
  Hmmm.  I was assuming September, given how late the beta came out,
  and that nobody has previously talked seriously about a June
  release.  I'll also point out that while there's a beta2 tarball,
  there was no announcement and no packages for it.
 
  If we decide to do June, then PR will be minimal because I was
  assuming I had another 7 weeks to prepare it.  Not that that
  should be the deciding factor (it would be great to get out an
  early release and get it out of the way) but it should be taken
  into consideration.
 
 
 Not really sure why we're discussing it.
 
 We've always had a long beta cycle and this seems to cut it very
 short at both ends.

I'm not sure I understand how having a long beta cycle is a good
thing, or even whether you're saying it is.  Could you please clarify?

 If we're going to have a fixed cutoff date for patches then having a
 variable release date seems weird, unless it is to elongate it to fix
 bugs.
 
 Let me put it this way: is there a benefit to changing the plan?

Sure.  If it's ready to go, there's no point holding it back.
Downstream projects get a little extra integration time, etc.

 Anyone desperate for the new features is already using them, and we're
 relying upon that to throw up bugs during beta.

I'm always grateful to those brave souls who go there with versions
of PostgreSQL that aren't even beta.  They're a big chunk of why it
might well be OK to step toward this major release :)

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.3: load path to mitigate load penalty for checksums

2012-06-12 Thread Robert Haas

On Mon, Jun 11, 2012 at 2:16 AM, Jeff Davis pg...@j-davis.com wrote:
 On Wed, 2012-06-06 at 22:16 -0400, Noah Misch wrote:
 Note that, currently, only VACUUM sets PD_ALL_VISIBLE and visibility map 
 bits.
 Would you make something else like heap_multi_insert() be able to do so?

 That was the plan (roughly). I was thinking about doing it at the time a
 new page was allocated.

 Avoiding measurable overhead in tuple visibility checks when the feature is
 inactive may well prove to be a key implementation challenge.

 Perhaps a rudimentary CLOG cache, or some other way to mitigate CLOG
 access could make it bearable.

 Although I would like it to be an online operation, I'm not quite as
 concerned about reads. I'd like to mitigate any major penalty, but if
 reads are expensive during a load, than so be it.

  Then, it would remember the current xid
  as max_loader_xid, and wait until the global xmin is greater than
  max_loader_xid. This should ensure that all snapshots regard all loading
  transactions as complete.

 ... this might not be.  Each backend could decide, based on its own xmin,
 whether to ignore PD_ALL_VISIBLE in a given table.  In other words, your
 ignorehints flag could be an xmin set to InvalidTransactionId during stages 1
 and 2 and to the earliest safe xmin during stages 0 and 3.

 That's a good idea. It might make it easier to implement, and removing a
 step from finalization is certainly a big plus.

   * INITIATE and FINALIZE probably need to use PreventTransactionChain()
  and multiple transactions, to avoid holding the ShareUpdateExclusiveLock
  for too long. Also, we want to keep people from using it in the same
  transaction as the loading xact, because they might not realize that
  they would get a concurrency of 1 that way (because of the
  ShareUpdateExclusiveLock).

 Yes.  You need to commit the transaction modifying pg_class so other backends
 can observe the change, at which point you can gather the list to wait on.

 Consider splitting the INITIATE UI into two interfaces, one that transitions
 from state 0 to state 1 and another that expects state 1 and blocks until we
 reach state 2.  You then have no need for PreventTransactionChain(), and the
 interfaces could even be normal functions.  It's less clear how reasonably 
 you
 could do this for the FINALIZE step, given its implicit VACUUM.  It could be
 achieved by having the user do the VACUUM and making the new interface merely
 throw an error if a VACUUM is still needed.  The trivial usage pattern might
 look like this:

 SELECT pg_initiate_load('bigtbl');
 SELECT pg_wait_load('bigtbl'); -- not a great name
 COPY bigtbl FROM STDIN;
 SELECT pg_stop_load('bigtbl');
 VACUUM bigtbl;
 SELECT pg_finalize_load('bigtbl');

 It's definitely less elegant, alas.  Perhaps offer the interface you've
 proposed and have it do the above under the hood.  That way, users with
 complex needs have the flexibility of the lower-level interfaces while those
 who can tolerate PreventTransactionChain() have simplicity.

 I think that's a reasonable suggestion. I am going back and forth a
 little on this one. It's got the benefit that you can see the internal
 states more clearly, and it's easier to tell what's going on, and it's
 better if we want to do more sophisticated testing.

 The main drawback here is that it's exposing more to the user. I
 imagined that we might want to push other kinds of optimizations into
 the load path, and that might upset the interface you've described
 above. Then again, we'll probably need the normal, load, and transition
 states regardless, so maybe it's an empty concern.

Instead of trying to maintain MVCC semantics, maybe we should just
have something like COPY (FROZEN) that just writes frozen tuples into
the table and throws MVCC out the window.  Seems like that would be a
lot easier to implement and satisfy basically the same use cases.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Ability to listen on two unix sockets

2012-06-12 Thread Honza Horak


On 06/11/2012 11:47 PM, Peter Eisentraut wrote:

On sön, 2012-06-10 at 17:24 -0400, Robert Haas wrote:

and also affects the naming of any UNIX sockets created.


Why would that matter?  If you configure M ports and N Unix socket
locations, you get M*N actual sockets created.


...I *seriously* doubt that this is the behavior anyone wants.
Creating M sockets per directory seems patently silly.


How else would it work?

If I say, syntax aside, listen on ports 5432 and 5433, and use socket
directories /tmp and /var/run/postgresql, then a libpq-using client
would expect to be able to connect using

-h /tmp -p 5432
-h /tmp -p 5433
-h /var/run/postgresql -p 5432
-h /var/run/postgresql -p 5433


This could be true in case all listening ports are equal, which I guess 
isn't a good idea, because IIUIC the port number as a part of the socket 
name is used for distinguish sockets of various postmasters in the same 
directory. In that scenario every client should know which port to 
connect and also which one is primary.


Regards,
Honza

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minimising windows installer password confusion

2012-06-12 Thread Dave Page

On Tue, Jun 12, 2012 at 1:35 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Magnus Hagander  wrote:
 Kevin Grittner  wrote:

 Are they running the installation as a system administrator? If
 so, rather than throwing up an error message and telling them to
 go use other tools to reset the password, is it possible for the
 administrator account to force a password change? If that is
 possible, it seems like it would be a lot more friendly. If not,
 perhaps the old postgres user could be renamed, and a new one
 created with the password?

 That might break another app running nuder that account. Such as a
 different version of PostgreSQL...

Right.

 But an option could be to create a different account to run it
 under, I guess... Leaving the old one where it is. I think that's
 better than renaming the old one, really.

I'm not keen on adding additional user accounts - that's a security
problem imho. It'll leave the unaware user with multiple accounts on
the system, and may cause those that do understand what's going on
pain because they'll have to deal with multiple accounts for things
like server-side copy.

It also doesn't solve the problem during upgrades, though admittedly
that seems to be less common.

 That makes sense.  I just think we should try very hard to make the
 installer just work to the extent possible, rather than trying to
 direct the user in how to use system tools in the middle of the
 process.

Right - that's what always aim to do (and in fact was the number one
driver behind the current generation of installers), and provided the
user remembers their password it works just fine.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minimising windows installer password confusion

2012-06-12 Thread Magnus Hagander

On Tue, Jun 12, 2012 at 2:48 PM, Dave Page dp...@pgadmin.org wrote:
 On Tue, Jun 12, 2012 at 1:35 PM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 Magnus Hagander  wrote:
 Kevin Grittner  wrote:

 Are they running the installation as a system administrator? If
 so, rather than throwing up an error message and telling them to
 go use other tools to reset the password, is it possible for the
 administrator account to force a password change? If that is
 possible, it seems like it would be a lot more friendly. If not,
 perhaps the old postgres user could be renamed, and a new one
 created with the password?

 That might break another app running nuder that account. Such as a
 different version of PostgreSQL...

 Right.

 But an option could be to create a different account to run it
 under, I guess... Leaving the old one where it is. I think that's
 better than renaming the old one, really.

 I'm not keen on adding additional user accounts - that's a security
 problem imho. It'll leave the unaware user with multiple accounts on
 the system, and may cause those that do understand what's going on
 pain because they'll have to deal with multiple accounts for things
 like server-side copy.

Oh, I certainly wouldn't do it without *informing* and verifying it
with the user.


 It also doesn't solve the problem during upgrades, though admittedly
 that seems to be less common.

Why do you need the account at all during upgrades? Don't you just
stop the service and replace the binaries?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.2 final

2012-06-12 Thread Robert Haas

On Mon, Jun 11, 2012 at 1:02 PM, Joshua Berkus j...@agliodbs.com wrote:
 If we decide to do June, then PR will be minimal because I was assuming I had 
 another 7 weeks to prepare it.  Not that that should be the deciding factor 
 (it would be great to get out an early release and get it out of the way) but 
 it should be taken into consideration.

Well, so far, enthusiasm for getting a release out sooner than the
fall seems pretty limited.  Barring an uptick in enthusiasm, I think
you're going to get your 7 weeks.

Personally, I like getting releases out the door sooner, because then
people can start using the features sooner.  And that's the whole
point, isn't it?  But with no PR and no installers it won't be much of
a release, so I think we have to wait.  We might want to try to
start putting together some dates for beta3 and/or rc1, though, even
if they're a bit further out, so that the people who have to do work
around that can know what dates they're trying to hit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minimising windows installer password confusion

2012-06-12 Thread Dave Page

On Tue, Jun 12, 2012 at 1:49 PM, Magnus Hagander mag...@hagander.net wrote:


 I'm not keen on adding additional user accounts - that's a security
 problem imho. It'll leave the unaware user with multiple accounts on
 the system, and may cause those that do understand what's going on
 pain because they'll have to deal with multiple accounts for things
 like server-side copy.

 Oh, I certainly wouldn't do it without *informing* and verifying it
 with the user.

That'll add additional steps for all users, and likely confuse the
novices even more.

 It also doesn't solve the problem during upgrades, though admittedly
 that seems to be less common.

 Why do you need the account at all during upgrades? Don't you just
 stop the service and replace the binaries?

Because re-running the current installer or running an upgrade should
repair an existing installation as well as doing any upgrades
required.


-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] New Postgres committer: Kevin Grittner

2012-06-12 Thread Dimitri Fontaine

Tom Lane t...@sss.pgh.pa.us writes:
 I am pleased to announce that Kevin Grittner has accepted the core
 committee's invitation to become our newest committer.

I'm hesitant whether to congratulate more the community for gaining
Kevin or Kevin for joining as a commiter :) In short, +1 !

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minimising windows installer password confusion

2012-06-12 Thread Robert Haas

On Tue, Jun 12, 2012 at 8:53 AM, Dave Page dp...@pgadmin.org wrote:
 Oh, I certainly wouldn't do it without *informing* and verifying it
 with the user.

 That'll add additional steps for all users, and likely confuse the
 novices even more.

The real issue here is that it's nuts to tell the user please enter
either a new password or the password for the account that already
exists, but I'm not telling you which one.

What we need is to display a different dialogue based on the situation.

If the account already exists, we should say Please enter the
password for the existing postgres account.  If you do not know the
password, you can reset it using the Windows control panel.

But if it doesn't already exist, we should say The installer will
create a postgres account on this machine.  Please enter a password
for the new account.

If we can do that, all of these problems will go away.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.2 final

2012-06-12 Thread Tom Lane

Simon Riggs si...@2ndquadrant.com writes:
 Let me put it this way: is there a benefit to changing the plan?

The plan is, and always has been, we'll release when it's ready.
We generally suppose that a release is ready when the rate of bug
reports against the beta has dropped off substantially.  It's certainly
absurd to claim that 9.2 has reached that stage.

Personally I'd very much like to see 9.2 out before the fall, mainly
because I'd like to get it into Fedora 18 (which would require 9.2.0
to be out by early September at the latest).  But getting it out in
June would require abandoning every standard for release quality we've
ever had.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minimising windows installer password confusion

2012-06-12 Thread Dave Page

On Tue, Jun 12, 2012 at 2:57 PM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jun 12, 2012 at 8:53 AM, Dave Page dp...@pgadmin.org wrote:
 Oh, I certainly wouldn't do it without *informing* and verifying it
 with the user.

 That'll add additional steps for all users, and likely confuse the
 novices even more.

 The real issue here is that it's nuts to tell the user please enter
 either a new password or the password for the account that already
 exists, but I'm not telling you which one.

That's a good point.

 What we need is to display a different dialogue based on the situation.

 If the account already exists, we should say Please enter the
 password for the existing postgres account.  If you do not know the
 password, you can reset it using the Windows control panel.

 But if it doesn't already exist, we should say The installer will
 create a postgres account on this machine.  Please enter a password
 for the new account.

 If we can do that, all of these problems will go away.

Yeah - that'll require some additional code to check if the account
exists, but we can probably copy/paste that from the existing utility
that creates the account (or better yet, refactor it to allow us to
check or check  create as it does now).

Ashesh/Sachin/Dharam - do you see any potential issues with that?

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.2 final

2012-06-12 Thread Magnus Hagander

On Tue, Jun 12, 2012 at 4:08 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 Let me put it this way: is there a benefit to changing the plan?

 The plan is, and always has been, we'll release when it's ready.
 We generally suppose that a release is ready when the rate of bug
 reports against the beta has dropped off substantially.  It's certainly
 absurd to claim that 9.2 has reached that stage.

 Personally I'd very much like to see 9.2 out before the fall, mainly
 because I'd like to get it into Fedora 18 (which would require 9.2.0
 to be out by early September at the latest).  But getting it out in
 June would require abandoning every standard for release quality we've
 ever had.

Yeah, I don't hink that's reasonable at all.

And it doesn't make sense to release in july or at least first half of
august due to vacations. Maybe all of august. Which has us in early
september anyway.

But if we believe we've taken care of all known open issues fairly
soon, getting an RC out rather than a beta before people go on
vacation would probably be a good thing...

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: [COMMITTERS] pgsql: Run pgindent on 9.2 source tree in preparation for first 9.3

2012-06-12 Thread Bruce Momjian

On Mon, Jun 11, 2012 at 05:57:41PM -0400, Alvaro Herrera wrote:
 
 Excerpts from Bruce Momjian's message of lun jun 11 15:44:16 -0400 2012:
  
  On Mon, Jun 11, 2012 at 12:20:13PM -0400, Alvaro Herrera wrote:
 Hm, does this touch stuff that would also be modified by perltidy?  I
 wonder if we should refrain from doing entab/detab on perl files and
 instead have perltidy touch such code.
   
The Perl files were modified by perltidy and not by pgindent, as
documented in the pgindent README:

9) Indent the Perl MSVC code:

cd src/tools/msvc
perltidy -b -bl -nsfs -naws -l=100 -ole=unix *.pl *.pm
   
   Oh, I see.  That's great then.  Should those change be committed
   separately, just to avoid confusion?  BTW those aren't the only Perl
  
  Not sure.  I just followed the README instructions.  I should just
  probably mention the Perl files were not processed by pgindent on the
  commit.
 
 Well, you wrote the instructions yourself :-)

Well, initially, yes, but others have improved them over time.

   files in the source tree -- we also have the genbki stuff, for example.
   (There is already some inconsistency in tabs/spaces in genbki.pl
   already)
  
  I was not aware of them.  If you want them run, would you update the
  pgindent README to mention them please?
 
 What about something like this in the root of the tree:
 find . -name \*.pl -o -name \*.pm | xargs perltidy -b -bl -nsfs -naws -l=100 
 -ole=unix
 
 There are files all over the place.  The file that would most be
 affected with one run of this is the ECPG grammar generator.

Sounds like a good idea to me.

 I checked the -et=4 business (which is basically entab).  We're pretty
 inconsistent about tabs in perl code it seems; some files use tabs
 others use spaces.  Honestly I would just settle on what we use on C
 files, even if the Perl devs don't recommend it because of
 maintainability and portability.  I mean if it works well for us for C
 code, why would it be a problem in Perl code?  However, I don't write
 much of that Perl code myself.

Yes, I would love to hear a Perl person chime in here to tell us it is
OK, as you stated.

I suppose if we don't get any feedback in a few days, let's just go
ahead and make the changes you suggested.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Restrict ALTER FUNCTION CALLED ON NULL INPUT (was Re: Not quite a security hole: CREATE LANGUAGE for non-superusers)

2012-06-12 Thread Noah Misch

On Mon, Jun 11, 2012 at 11:03:10PM -0400, Tom Lane wrote:
 Noah Misch n...@leadboat.com writes:
  CREATE FUNCTION + ALTER FUNCTION OWNER TO is useful for creating another
  user's untrusted-language SECURITY DEFINER function.  ALTER FUNCTION 
  CALLED ON
  NULL INPUT ought to require that the user be eligible to redefine the 
  function
  completely.
 
  Here's a patch implementing that restriction.  To clarify, I see no need to
  repeat *all* the CREATE-time checks; for example, there's no need to recheck
  permission to use the return type.  The language usage check is enough.
 
 This seems bizarre and largely unnecessary.  As you stated to begin
 with, granting ownership of a function implies some degree of trust.

Yes, but I would never expect that level of trust to include access to crash
the server as a consequence of the function's reliance on STRICT.

 I do not want to get into the business of parsing exactly which variants
 of ALTER FUNCTION ought to be considered safe.

Fair concern.

 And I definitely don't
 want to add a check that enforces restrictions against cases that have
 got nothing whatever to do with C-language functions, as this patch
 does.

We don't have a principled basis for assuming that this hazard cannot apply to
third-party untrusted languages.  We could add another pg_language flag to
make the distinction for languages like plperlu.  They're untrusted by virtue
of granted access beyond the database, but no mismatch between the function
definition and the function implementation can crash the server or similar.
Adding such a thing at this point seems excessive to me.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.2 final

2012-06-12 Thread Josh Berkus


 But if we believe we've taken care of all known open issues fairly
 soon, getting an RC out rather than a beta before people go on
 vacation would probably be a good thing...

Yeah, that's the other reason I'm not so wild about a June release; I
have a whole list of tests I want to do against beta2, and to date I've
done none of them.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] reviewers for the upcoming CommitFest

2012-06-12 Thread Robert Haas

Folks,

Per discussion at the PGCon developers meeting, everyone who submitted
patches to the upcoming CommitFest should also sign up to review an
equal number of patches (of roughly comparable complexity).  Of
course, we also need people who haven't submitted anything to
volunteer to review patches.  Many patches, especially large ones, can
benefit from multiple reviewers.

http://wiki.postgresql.org/wiki/Reviewing_a_Patch

I am sure there will be a bunch of last-minute submissions, probably
large, but it would nice if people could start signing up for things
now, as there are a lot of blanks at present.  Edit the patch you're
going to review, and put your name in as a reviewer.

https://commitfest.postgresql.org/action/commitfest_view?id=14

Post your review on pgsql-hackers as a reply the original thread,
ideally within 4-5 days of the start of the CommitFest, which is due
to begin this Friday, June 15th.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.3: load path to mitigate load penalty for checksums

2012-06-12 Thread Merlin Moncure

On Tue, Jun 12, 2012 at 7:42 AM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Jun 11, 2012 at 2:16 AM, Jeff Davis pg...@j-davis.com wrote:
 On Wed, 2012-06-06 at 22:16 -0400, Noah Misch wrote:
 Note that, currently, only VACUUM sets PD_ALL_VISIBLE and visibility map 
 bits.
 Would you make something else like heap_multi_insert() be able to do so?

 That was the plan (roughly). I was thinking about doing it at the time a
 new page was allocated.

 Avoiding measurable overhead in tuple visibility checks when the feature is
 inactive may well prove to be a key implementation challenge.

 Perhaps a rudimentary CLOG cache, or some other way to mitigate CLOG
 access could make it bearable.

 Although I would like it to be an online operation, I'm not quite as
 concerned about reads. I'd like to mitigate any major penalty, but if
 reads are expensive during a load, than so be it.

  Then, it would remember the current xid
  as max_loader_xid, and wait until the global xmin is greater than
  max_loader_xid. This should ensure that all snapshots regard all loading
  transactions as complete.

 ... this might not be.  Each backend could decide, based on its own xmin,
 whether to ignore PD_ALL_VISIBLE in a given table.  In other words, your
 ignorehints flag could be an xmin set to InvalidTransactionId during stages 
 1
 and 2 and to the earliest safe xmin during stages 0 and 3.

 That's a good idea. It might make it easier to implement, and removing a
 step from finalization is certainly a big plus.

   * INITIATE and FINALIZE probably need to use PreventTransactionChain()
  and multiple transactions, to avoid holding the ShareUpdateExclusiveLock
  for too long. Also, we want to keep people from using it in the same
  transaction as the loading xact, because they might not realize that
  they would get a concurrency of 1 that way (because of the
  ShareUpdateExclusiveLock).

 Yes.  You need to commit the transaction modifying pg_class so other 
 backends
 can observe the change, at which point you can gather the list to wait on.

 Consider splitting the INITIATE UI into two interfaces, one that transitions
 from state 0 to state 1 and another that expects state 1 and blocks until we
 reach state 2.  You then have no need for PreventTransactionChain(), and the
 interfaces could even be normal functions.  It's less clear how reasonably 
 you
 could do this for the FINALIZE step, given its implicit VACUUM.  It could be
 achieved by having the user do the VACUUM and making the new interface 
 merely
 throw an error if a VACUUM is still needed.  The trivial usage pattern might
 look like this:

 SELECT pg_initiate_load('bigtbl');
 SELECT pg_wait_load('bigtbl'); -- not a great name
 COPY bigtbl FROM STDIN;
 SELECT pg_stop_load('bigtbl');
 VACUUM bigtbl;
 SELECT pg_finalize_load('bigtbl');

 It's definitely less elegant, alas.  Perhaps offer the interface you've
 proposed and have it do the above under the hood.  That way, users with
 complex needs have the flexibility of the lower-level interfaces while those
 who can tolerate PreventTransactionChain() have simplicity.

 I think that's a reasonable suggestion. I am going back and forth a
 little on this one. It's got the benefit that you can see the internal
 states more clearly, and it's easier to tell what's going on, and it's
 better if we want to do more sophisticated testing.

 The main drawback here is that it's exposing more to the user. I
 imagined that we might want to push other kinds of optimizations into
 the load path, and that might upset the interface you've described
 above. Then again, we'll probably need the normal, load, and transition
 states regardless, so maybe it's an empty concern.

 Instead of trying to maintain MVCC semantics, maybe we should just
 have something like COPY (FROZEN) that just writes frozen tuples into
 the table and throws MVCC out the window.  Seems like that would be a
 lot easier to implement and satisfy basically the same use cases.

-1: The situation with hint bit i/o patterns on many workloads is
untenable but it's not safe to assume MVCC can be ditched in those
workloads.  Also, COPY does nothing about deletes.  Neither does the
proposal as stated but I think it's easier to generalize into 'I want
to put hint bits in now so I don't have to deal with them later'.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.3: load path to mitigate load penalty for checksums

2012-06-12 Thread Tom Lane

Merlin Moncure mmonc...@gmail.com writes:
 On Tue, Jun 12, 2012 at 7:42 AM, Robert Haas robertmh...@gmail.com wrote:
 Instead of trying to maintain MVCC semantics, maybe we should just
 have something like COPY (FROZEN) that just writes frozen tuples into
 the table and throws MVCC out the window.  Seems like that would be a
 lot easier to implement and satisfy basically the same use cases.

 -1: The situation with hint bit i/o patterns on many workloads is
 untenable but it's not safe to assume MVCC can be ditched in those
 workloads.  Also, COPY does nothing about deletes.  Neither does the
 proposal as stated but I think it's easier to generalize into 'I want
 to put hint bits in now so I don't have to deal with them later'.

Dunno, I think Robert's idea has a fair amount of merit: mainly because
it will probably satisfy 90% of use cases for 1% of the work.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] xml_is_document and selective pg_re_throw

2012-06-12 Thread Tom Lane

Nikhil Sontakke nikkh...@gmail.com writes:
 Consider:

 SELECT xml 'foobar/foobarfoo/bar' IS DOCUMENT;

 And I was looking at xml_is_document() source code. It calls xml_parse
 which throws an error with code set to ERRCODE_INVALID_XML_DOCUMENT. The
 catch block of xml_parse then rethrows.

 Now xml_is_document does a selective rethrow only if the error is not
 ERRCODE_INVALID_XML_DOCUMENT. I can understand that this function does this
 to return true/false, but doesn't this behavior of not propagating the
 error up all the way dangerous? InterruptHoldoffCount inconsistencies for
 instance?

No, I don't see any particular risk there.  The places that might throw
ERRCODE_INVALID_XML_DOCUMENT are sufficiently few (as in, exactly one,
in this usage) that we can have reasonable confidence we know what the
system state is when we catch that error.

 A better way would have been to modify xml_parse to take an additional
 boolean argument to_rethrow and not to rethrow if that is false?

We could do that, but it would greatly complicate xml_parse IMO, since
it still needs its own PG_TRY block to handle other error cases, and
only one of those error cases ought to optionally return failure instead
of re-throwing.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.3: load path to mitigate load penalty for checksums

2012-06-12 Thread Kevin Grittner

Tom Lane t...@sss.pgh.pa.us wrote:
 
 I think Robert's idea has a fair amount of merit: mainly because
 it will probably satisfy 90% of use cases for 1% of the work.
 
+1, especially if we include a pg_dump option to write the COPY
statements with that option.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels

2012-06-12 Thread Tom Lane

I was reminded today that we still haven't done anything about this:

Tom Lane t...@sss.pgh.pa.us writes:
 While testing 9.1 RPMs on Fedora 15 (2.6.40 kernel), I notice
 messages like these in the kernel log:
 Sep 11 13:38:56 rhl kernel: [  415.308092] postgres (18040): 
 /proc/18040/oom_adj is deprecated, please use /proc/18040/oom_score_adj 
 instead.

At this point there are no shipping Fedora versions that don't emit this
gripe, and F15 is even about to go EOL.

The previous discussion thread at
http://archives.postgresql.org/pgsql-hackers/2011-09/msg00794.php
went off into the weeds of what was in my opinion over-design.
I still think it's sufficient to do what I suggested initially:

 ... The simplest, least risky change that I can think of is to
 copy-and-paste the relevant #ifdef code block in fork_process.c.
 If we do that, then it would be up to the packager whether to #define
 LINUX_OOM_ADJ or LINUX_OOM_SCORE_ADJ or both depending on the behavior
 he wants.

and would like to squeeze that into 9.2 so that we're only a year late
and not two years late in responding to this issue :-(.

Objections?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.3: load path to mitigate load penalty for checksums

2012-06-12 Thread Merlin Moncure

On Tue, Jun 12, 2012 at 11:12 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 On Tue, Jun 12, 2012 at 7:42 AM, Robert Haas robertmh...@gmail.com wrote:
 Instead of trying to maintain MVCC semantics, maybe we should just
 have something like COPY (FROZEN) that just writes frozen tuples into
 the table and throws MVCC out the window.  Seems like that would be a
 lot easier to implement and satisfy basically the same use cases.

 -1: The situation with hint bit i/o patterns on many workloads is
 untenable but it's not safe to assume MVCC can be ditched in those
 workloads.  Also, COPY does nothing about deletes.  Neither does the
 proposal as stated but I think it's easier to generalize into 'I want
 to put hint bits in now so I don't have to deal with them later'.

 Dunno, I think Robert's idea has a fair amount of merit: mainly because
 it will probably satisfy 90% of use cases for 1% of the work.

90%?  Hint bits i/o issues are not limited to bulk loads.   They apply
to all many-record-per-transaction DML including (and especially)
deletes.  Also it's not safe to assume that insertion heavy clients
can be migrated to COPY.  For example, JDBC bulk loading AFAIK does
not use COPY and even if it did wouldn't be able to decorate the
command for a long time for most production workloads.

Vs Jeff's proposal you have a point -- I'm just very skeptical it's
going to do enough to mitigate the performance hit.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels

2012-06-12 Thread Robert Haas

On Tue, Jun 12, 2012 at 12:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I was reminded today that we still haven't done anything about this:

 Tom Lane t...@sss.pgh.pa.us writes:
 While testing 9.1 RPMs on Fedora 15 (2.6.40 kernel), I notice
 messages like these in the kernel log:
 Sep 11 13:38:56 rhl kernel: [  415.308092] postgres (18040): 
 /proc/18040/oom_adj is deprecated, please use /proc/18040/oom_score_adj 
 instead.

 At this point there are no shipping Fedora versions that don't emit this
 gripe, and F15 is even about to go EOL.

 The previous discussion thread at
 http://archives.postgresql.org/pgsql-hackers/2011-09/msg00794.php
 went off into the weeds of what was in my opinion over-design.
 I still think it's sufficient to do what I suggested initially:

 ... The simplest, least risky change that I can think of is to
 copy-and-paste the relevant #ifdef code block in fork_process.c.
 If we do that, then it would be up to the packager whether to #define
 LINUX_OOM_ADJ or LINUX_OOM_SCORE_ADJ or both depending on the behavior
 he wants.

 and would like to squeeze that into 9.2 so that we're only a year late
 and not two years late in responding to this issue :-(.

 Objections?

I think my position, and the position of the people who responded to
the original thread, was that it seems like you're architecting a
kludge when it wouldn't be that hard to architect a nicer solution.
That having been said, I don't think it's such a large kludge that we
should all have massive dry heaves at the thought of it living in our
repository.  And then too, this isn't the time to be architecting new
9.2 feature anyway.  So I say go for it.  If it makes your life
easier, back-patch it.  Code that requires a #define to enable it
won't break anything for anyone else.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.3: load path to mitigate load penalty for checksums

2012-06-12 Thread Greg Stark

On Tue, Jun 12, 2012 at 5:41 PM, Merlin Moncure mmonc...@gmail.com wrote:
 Also it's not safe to assume that insertion heavy clients
 can be migrated to COPY.  For example, JDBC bulk loading AFAIK does
 not use COPY and even if it did wouldn't be able to decorate the
 command for a long time for most production workloads

A non-mvcc table would only be useful if you're loading the data all
at once and don't plan to load any more. So that wouldn't even be
attempting to cover all insertion heavy clients. Just specific use
cases.

I had in mind to implement something like this for archived data such
as old partitions in partitioned table. I think data loaded in a copy
that you know is never going to be modified would be a reasonable
other case for the same codepath.

Note that this particular use case is covered by FDW as well. If the
source data is in a dense enough format you could save even the load
step by reading directly from the source. I don't think this
eliminates the idea of having a denser read-only format though.
Postgres has lots of other features like indexing and different data
representations that might be useful and would be missing from, say, a
csv source file.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels

2012-06-12 Thread Tom Lane

Robert Haas robertmh...@gmail.com writes:
 On Tue, Jun 12, 2012 at 12:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I still think it's sufficient to do what I suggested initially:
 ... The simplest, least risky change that I can think of is to
 copy-and-paste the relevant #ifdef code block in fork_process.c.

 I think my position, and the position of the people who responded to
 the original thread, was that it seems like you're architecting a
 kludge when it wouldn't be that hard to architect a nicer solution.

I'd be the first to agree it's a kluge.  But the end result of the
previous discussion was that it wasn't so obvious how to architect
a nicer solution.  Nor is there all that much room for people to use a
nicer solution, given that we need help from not just fork_process.c
but the root-privileged startup script (or lately in Fedora it's a
systemd unit script doing the privilege-increasing end of this).

In the short run, if we don't have consensus on this, I'll probably
just carry a Fedora patch like so:

-intfd = open(/proc/self/oom_adj, O_WRONLY, 0);
+intfd = open(/proc/self/oom_score_adj, O_WRONLY, 0);

But it seems a tad silly to be carrying such a patch for a block of
code that is only of interest to Linux packagers anyway, and nearly
all such packagers are facing this same issue either now or in the
very near future.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.3: load path to mitigate load penalty for checksums

2012-06-12 Thread Robert Haas

On Tue, Jun 12, 2012 at 12:41 PM, Merlin Moncure mmonc...@gmail.com wrote:
 -1: The situation with hint bit i/o patterns on many workloads is
 untenable but it's not safe to assume MVCC can be ditched in those
 workloads.  Also, COPY does nothing about deletes.  Neither does the
 proposal as stated but I think it's easier to generalize into 'I want
 to put hint bits in now so I don't have to deal with them later'.

 Dunno, I think Robert's idea has a fair amount of merit: mainly because
 it will probably satisfy 90% of use cases for 1% of the work.

 90%?  Hint bits i/o issues are not limited to bulk loads.   They apply
 to all many-record-per-transaction DML including (and especially)
 deletes.  Also it's not safe to assume that insertion heavy clients
 can be migrated to COPY.  For example, JDBC bulk loading AFAIK does
 not use COPY and even if it did wouldn't be able to decorate the
 command for a long time for most production workloads.

 Vs Jeff's proposal you have a point -- I'm just very skeptical it's
 going to do enough to mitigate the performance hit.

I don't think it's going to solve the problem in general, but TBH I
don't think Jeff's proposal is, either.  I mean, ignoring
xmin-committed, xmax-committed, and all-visible bits is going to come
with a pretty steep performance penalty all of its own.  I don't doubt
that you can construct situations in which it's better than incurring
the I/O associated with setting those bits after the fact, but I also
don't doubt that the reverse is true.  Furthermore, any system that
involves sometimes ignoring those things is going to add cost in
extremely hot code paths even when the user doesn't elect to take
advantage of the new feature.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minimising windows installer password confusion

2012-06-12 Thread Sachin Srivastava

On Tue, Jun 12, 2012 at 7:43 PM, Dave Page dp...@pgadmin.org wrote:

 On Tue, Jun 12, 2012 at 2:57 PM, Robert Haas robertmh...@gmail.com
 wrote:
  On Tue, Jun 12, 2012 at 8:53 AM, Dave Page dp...@pgadmin.org wrote:
  Oh, I certainly wouldn't do it without *informing* and verifying it
  with the user.
 
  That'll add additional steps for all users, and likely confuse the
  novices even more.
 
  The real issue here is that it's nuts to tell the user please enter
  either a new password or the password for the account that already
  exists, but I'm not telling you which one.

 That's a good point.

  What we need is to display a different dialogue based on the situation.
 
  If the account already exists, we should say Please enter the
  password for the existing postgres account.  If you do not know the
  password, you can reset it using the Windows control panel.
 
  But if it doesn't already exist, we should say The installer will
  create a postgres account on this machine.  Please enter a password
  for the new account.
 
  If we can do that, all of these problems will go away.

 Yeah - that'll require some additional code to check if the account
 exists, but we can probably copy/paste that from the existing utility
 that creates the account (or better yet, refactor it to allow us to
 check or check  create as it does now).

 Ashesh/Sachin/Dharam - do you see any potential issues with that?

Nope.. We do have the code to check whether the user exists or not..


 --
 Dave Page
 Blog: http://pgsnake.blogspot.com
 Twitter: @pgsnake

 EnterpriseDB UK: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company




-- 
Regards,
Sachin Srivastava
EnterpriseDB, India

Re: [HACKERS] 9.3: load path to mitigate load penalty for checksums

2012-06-12 Thread Tom Lane

Robert Haas robertmh...@gmail.com writes:
 I don't think it's going to solve the problem in general, but TBH I
 don't think Jeff's proposal is, either.  I mean, ignoring
 xmin-committed, xmax-committed, and all-visible bits is going to come
 with a pretty steep performance penalty all of its own.  I don't doubt
 that you can construct situations in which it's better than incurring
 the I/O associated with setting those bits after the fact, but I also
 don't doubt that the reverse is true.  Furthermore, any system that
 involves sometimes ignoring those things is going to add cost in
 extremely hot code paths even when the user doesn't elect to take
 advantage of the new feature.

Yeah; the notion of adding cycles to the visibility-check code paths,
which would add cost even for users who have no use at all for this
feature, is close to being a deal-breaker for me.  I would rather see
us designing this around the idea of what can we do without adding
any new complexity in visibility checks?.

At least for MVCC cases (hence, user tables only), it seems like we
could pre-set XMIN_COMMITTED hint bits if there were a way to be sure
that the XID in question would be seen as still running in the followup
snapshot check that every MVCC visibility test must make anyway.  The
hard part of that seems to be post-crash behavior, but maybe it'd be
acceptable to incur crash-recovery-time cost to run around and unset
such bits?

This doesn't do anything for pre-freezing of course, but I basically
don't see any way that we could pre-freeze without breaking MVCC
semantics anyhow.  So there's still attraction in the idea of offering
users the choice of not sustaining MVCC behavior in exchange for cheaper
bulk loads.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_basebackup --xlog compatibility break

2012-06-12 Thread Fujii Masao

On Tue, Jun 12, 2012 at 6:28 AM, Peter Eisentraut pete...@gmx.net wrote:
 On sön, 2012-06-10 at 13:43 +0200, Magnus Hagander wrote:
 On Mon, Jun 4, 2012 at 2:14 PM, Peter Eisentraut pete...@gmx.net wrote:
  On tis, 2012-05-29 at 22:31 +0200, Magnus Hagander wrote:
  Yeah, good arguments all around, i agree too :-) Next question is -
  suggestions for naming of said paramter?
 
  --xlog-method=something?  And/or -Xsomething, which would automatically
  enable -x?

 How's this?

 I wouldn't make -x and -X exclusive.  The way I understood this is, -x
 means include xlog, and -X says how to.

And we can specify -X without -x like -Z and -z option of pg_basebackup?
If no, I disagree with you because I don't want to specify two options to
use log streaming method.

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: [COMMITTERS] pgsql: Run pgindent on 9.2 source tree in preparation for first 9.3

2012-06-12 Thread Noah Misch

On Mon, Jun 11, 2012 at 05:57:41PM -0400, Alvaro Herrera wrote:
 What about something like this in the root of the tree:
 find . -name \*.pl -o -name \*.pm | xargs perltidy -b -bl -nsfs -naws -l=100 
 -ole=unix
 
 There are files all over the place.  The file that would most be
 affected with one run of this is the ECPG grammar generator.
 
 I checked the -et=4 business (which is basically entab).  We're pretty
 inconsistent about tabs in perl code it seems; some files use tabs
 others use spaces.  Honestly I would just settle on what we use on C
 files, even if the Perl devs don't recommend it because of
 maintainability and portability.  I mean if it works well for us for C
 code, why would it be a problem in Perl code?  However, I don't write
 much of that Perl code myself.

+1 for formatting all our Perl scripts and for including -et=4.  Since that
will rewrite currently-tidy files anyway, this is a good time to audit our
perltidy settings.

Why -l=100 instead of -l=78 like our C code?

perltidy changes this code:

for ($long_variable_name_to_initialize = 0;
 $long_variable_name_to_initialize  $long_limit_variable_name;
 $long_variable_name_to_initialize++)
{

to this:

for (
$long_variable_name_to_initialize = 0;
$long_variable_name_to_initialize  $long_limit_variable_name;
$long_variable_name_to_initialize++
  )
{

Using -vtc=2 removes the new trailing line break.  Additionally using -vt=2
-nsak=for removes the new leading line break, but it also removes the space
between for and (.  Anyone know how to make perltidy format this like we
do in C code?

Why -naws?  I would lean toward -aws -dws -pt=2 to change code like this:

-my $dbi=DBI-connect('DBI:Pg:dbname='.$opt{d});
+my $dbi = DBI-connect('DBI:Pg:dbname=' . $opt{d});

I'd also consider -kbl=2 to preserve runs of blank lines that the author used
to delineate related groups of functions.

Thanks,
nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: [COMMITTERS] pgsql: Run pgindent on 9.2 source tree in preparation for first 9.3

2012-06-12 Thread Bruce Momjian

On Tue, Jun 12, 2012 at 01:50:48PM -0400, Noah Misch wrote:
 On Mon, Jun 11, 2012 at 05:57:41PM -0400, Alvaro Herrera wrote:
  What about something like this in the root of the tree:
  find . -name \*.pl -o -name \*.pm | xargs perltidy -b -bl -nsfs -naws 
  -l=100 -ole=unix
  
  There are files all over the place.  The file that would most be
  affected with one run of this is the ECPG grammar generator.
  
  I checked the -et=4 business (which is basically entab).  We're pretty
  inconsistent about tabs in perl code it seems; some files use tabs
  others use spaces.  Honestly I would just settle on what we use on C
  files, even if the Perl devs don't recommend it because of
  maintainability and portability.  I mean if it works well for us for C
  code, why would it be a problem in Perl code?  However, I don't write
  much of that Perl code myself.
 
 +1 for formatting all our Perl scripts and for including -et=4.  Since that
 will rewrite currently-tidy files anyway, this is a good time to audit our
 perltidy settings.

OK, another open question is whether we should do any of these changes
now for 9.2/9.3 or wait for 9.3/9.4?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.3: load path to mitigate load penalty for checksums

2012-06-12 Thread Robert Haas

On Wed, Jun 6, 2012 at 8:42 PM, Jeff Davis pg...@j-davis.com wrote:
 On Wed, 2012-06-06 at 15:08 -0400, Robert Haas wrote:
 On Mon, Jun 4, 2012 at 9:26 PM, Jeff Davis pg...@j-davis.com wrote:
  Thoughts?

 Simon already proposed a way of doing this that doesn't require
 explicit user action, which seems preferable to a method that does
 require explicit user action, even though it's a little harder to
 implement.  His idea was to store the XID of the process creating the
 table in the pg_class row, which I think is *probably* better than
 your idea of having a process that waits and then flips the flag.
 There are some finicky details though - see previous thread for
 discussion of some of the issues.

 My goals include:

 * The ability to load into existing tables with existing data
 * The ability to load concurrently

 My understanding was that the proposal to which you're referring can't
 do those things, which seem like major limitations. Did I miss
 something?

No, you're correct.  I misread your original email, sorry.

I'm just thinking about this a little more.  It strikes me that the
core trade-off here is between doing more post-commit work and doing
more post-abort work.  For example, in your proposal, we've got to run
a lazy vacuum before exiting bulk load mode, because if a transaction
has aborted, we've got to get rid of its tuples before letting anyone
trust hint bits again.  That is, abort cleanup gets harder.  OTOH,
commit cleanup gets easier, because all the hint bits and visibility
map bits are already set: the only thing left is to freeze.  In
general, I think that's a good trade-off.  Commits are much more
common than aborts, and so we ought to be optimizing for the commit
case.

But maybe there are other ways of doing it, besides what you've
proposed here.   Not sure exactly what, but it might be worth thinking
about.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels

2012-06-12 Thread Robert Haas

On Tue, Jun 12, 2012 at 1:05 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Jun 12, 2012 at 12:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I still think it's sufficient to do what I suggested initially:
 ... The simplest, least risky change that I can think of is to
 copy-and-paste the relevant #ifdef code block in fork_process.c.

 I think my position, and the position of the people who responded to
 the original thread, was that it seems like you're architecting a
 kludge when it wouldn't be that hard to architect a nicer solution.

 I'd be the first to agree it's a kluge.  But the end result of the
 previous discussion was that it wasn't so obvious how to architect
 a nicer solution.  Nor is there all that much room for people to use a
 nicer solution, given that we need help from not just fork_process.c
 but the root-privileged startup script (or lately in Fedora it's a
 systemd unit script doing the privilege-increasing end of this).

Well, I don't think a GUC or two would be such a bad way of doing it, but...

 In the short run, if we don't have consensus on this, I'll probably
 just carry a Fedora patch like so:

 -            int        fd = open(/proc/self/oom_adj, O_WRONLY, 0);
 +            int        fd = open(/proc/self/oom_score_adj, O_WRONLY, 0);

 But it seems a tad silly to be carrying such a patch for a block of
 code that is only of interest to Linux packagers anyway, and nearly
 all such packagers are facing this same issue either now or in the
 very near future.

...I also agree with this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Run pgindent on 9.2 source tree in preparation for first 9.3

2012-06-12 Thread Robert Haas

On Tue, Jun 12, 2012 at 2:02 PM, Bruce Momjian br...@momjian.us wrote:
 On Tue, Jun 12, 2012 at 01:50:48PM -0400, Noah Misch wrote:
 On Mon, Jun 11, 2012 at 05:57:41PM -0400, Alvaro Herrera wrote:
  What about something like this in the root of the tree:
  find . -name \*.pl -o -name \*.pm | xargs perltidy -b -bl -nsfs -naws 
  -l=100 -ole=unix
 
  There are files all over the place.  The file that would most be
  affected with one run of this is the ECPG grammar generator.
 
  I checked the -et=4 business (which is basically entab).  We're pretty
  inconsistent about tabs in perl code it seems; some files use tabs
  others use spaces.  Honestly I would just settle on what we use on C
  files, even if the Perl devs don't recommend it because of
  maintainability and portability.  I mean if it works well for us for C
  code, why would it be a problem in Perl code?  However, I don't write
  much of that Perl code myself.

 +1 for formatting all our Perl scripts and for including -et=4.  Since that
 will rewrite currently-tidy files anyway, this is a good time to audit our
 perltidy settings.

 OK, another open question is whether we should do any of these changes
 now for 9.2/9.3 or wait for 9.3/9.4?

I don't think it matters very much - very few commits touch those perl
scripts.  If we have a consensus, I think it's fine to do it now, or
even after we branch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Mark JSON error detail messages for translation.

2012-06-12 Thread Tom Lane

Robert Haas rh...@postgresql.org writes:
 Mark JSON error detail messages for translation.
 Per gripe from Tom Lane.

OK, that was one generically bad thing about json.c's error messages.
The other thing that was bothering me was the style of the errdetail
strings, eg

ereport(ERROR,
(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
 errmsg(invalid input syntax for type json),
 errdetail(line %d: Invalid escape \\\%s\.,
   lex-line_number, extract_mb_char(s;

I'm not thrilled with the line %d: prefixes.  That seems to me to
violate the letter and spirit of the guideline about making errdetail
messages be complete sentences.  Furthermore, the line number is not
the most important part of the detail message, if indeed it has any
usefulness at all which is debatable.  (It's certainly not going to
be tremendously useful when the error report doesn't show any of the
specific input string being complained of.)

Also, a minority of the errors report the whole input string:

ereport(ERROR,
(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
 errmsg(invalid input syntax for type json: \%s\,
lex-input),
 errdetail(The input string ended unexpectedly.)));

And then there are some that provide the whole input string AND a line
number, just in case you thought there was any intentional design
decision there.

A minimum change IMO would be to recast the detail messages as complete
sentences, say The escape sequence \\\%s\ in line %d is invalid.
But I'd like a better-thought-out solution to identifying the location
of the error.

I notice that there's an unfinished attempt to maintain a line_start
pointer; if that were carried through, we could imagine printing the
current line up to the point of an error, which might provide a
reasonable balance between verbosity and insufficient context.
As an example, instead of

regression=# select '{x:1,
z  y:txt1}'::json;
ERROR:  invalid input syntax for type json
LINE 1: select '{x:1,
   ^
DETAIL:  line 2: Token z is invalid.

maybe we could print

regression=# select '{x:1,
z  y:txt1}'::json;
ERROR:  invalid input syntax for type json
LINE 1: select '{x:1,
   ^
DETAIL:  Token z is invalid in line 2: z.

or perhaps better let it run to the end of the line:

regression=# select '{x:1,
z  y:txt1}'::json;
ERROR:  invalid input syntax for type json
LINE 1: select '{x:1,
   ^
DETAIL:  Token z is invalid in line 2: z  y:txt1}.

Thoughts?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP patch for Todo Item : Provide fallback_application_name in contrib/pgbench, oid2name, and dblink

2012-06-12 Thread Robert Haas

On Mon, Jun 11, 2012 at 5:30 AM, Amit Kapila amit.kap...@huawei.com wrote:
 As per the previous discussion in link below, it seems that fallback
 application name needs to be provided for only

 pgbench and oid2name.

 http://archives.postgresql.org/message-id/w2g9837222c1004070216u3bc46b3ahbddfdffdbfb46...@mail.gmail.com



 However the title of Todo Item suggests it needs to be done for dblink as
 well.

 Please suggest if it needs to be done for dblink, if yes then what should be
 fallback_application_name for it?

Why not 'dblink'?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Mark JSON error detail messages for translation.

2012-06-12 Thread Robert Haas

On Tue, Jun 12, 2012 at 2:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas rh...@postgresql.org writes:
 Mark JSON error detail messages for translation.
 Per gripe from Tom Lane.

 OK, that was one generically bad thing about json.c's error messages.
 The other thing that was bothering me was the style of the errdetail
 strings, eg

                ereport(ERROR,
                        (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                         errmsg(invalid input syntax for type json),
                         errdetail(line %d: Invalid escape \\\%s\.,
                                   lex-line_number, extract_mb_char(s;

 I'm not thrilled with the line %d: prefixes.  That seems to me to
 violate the letter and spirit of the guideline about making errdetail
 messages be complete sentences.  Furthermore, the line number is not
 the most important part of the detail message, if indeed it has any
 usefulness at all which is debatable.  (It's certainly not going to
 be tremendously useful when the error report doesn't show any of the
 specific input string being complained of.)

I am amenable to rephrasing the errdetail to put the line number
elsewhere in the string, but I am strongly opposed to getting rid of
it.  JSON blobs can easily run to dozens or hundreds of lines, and you
will want to know the line number.  Knowing the contents of the line
will in many cases be LESS useful, because the line might contain,
say, a single closing bracket.  That's not gonna help much if it
happens many times in the input value, which is not unlikely.

 Also, a minority of the errors report the whole input string:

        ereport(ERROR,
                (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                 errmsg(invalid input syntax for type json: \%s\,
                        lex-input),
                 errdetail(The input string ended unexpectedly.)));

 And then there are some that provide the whole input string AND a line
 number, just in case you thought there was any intentional design
 decision there.

I did try.

 A minimum change IMO would be to recast the detail messages as complete
 sentences, say The escape sequence \\\%s\ in line %d is invalid.
 But I'd like a better-thought-out solution to identifying the location
 of the error.

 I notice that there's an unfinished attempt to maintain a line_start
 pointer; if that were carried through, we could imagine printing the
 current line up to the point of an error, which might provide a
 reasonable balance between verbosity and insufficient context.
 As an example, instead of

 regression=# select '{x:1,
 z  y:txt1}'::json;
 ERROR:  invalid input syntax for type json
 LINE 1: select '{x:1,
               ^
 DETAIL:  line 2: Token z is invalid.

 maybe we could print

 regression=# select '{x:1,
 z  y:txt1}'::json;
 ERROR:  invalid input syntax for type json
 LINE 1: select '{x:1,
               ^
 DETAIL:  Token z is invalid in line 2: z.

 or perhaps better let it run to the end of the line:

 regression=# select '{x:1,
 z  y:txt1}'::json;
 ERROR:  invalid input syntax for type json
 LINE 1: select '{x:1,
               ^
 DETAIL:  Token z is invalid in line 2: z  y:txt1}.

 Thoughts?

I'm not sure I find that an improvement, but I'm open to what other
people think.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Restrict ALTER FUNCTION CALLED ON NULL INPUT (was Re: Not quite a security hole: CREATE LANGUAGE for non-superusers)

2012-06-12 Thread Robert Haas

On Tue, Jun 12, 2012 at 11:31 AM, Noah Misch n...@leadboat.com wrote:
  Here's a patch implementing that restriction.  To clarify, I see no need to
  repeat *all* the CREATE-time checks; for example, there's no need to 
  recheck
  permission to use the return type.  The language usage check is enough.

 This seems bizarre and largely unnecessary.  As you stated to begin
 with, granting ownership of a function implies some degree of trust.

 Yes, but I would never expect that level of trust to include access to crash
 the server as a consequence of the function's reliance on STRICT.

+1.  Crashes are bad.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Restrict ALTER FUNCTION CALLED ON NULL INPUT (was Re: Not quite a security hole: CREATE LANGUAGE for non-superusers)

2012-06-12 Thread Tom Lane

Robert Haas robertmh...@gmail.com writes:
 On Tue, Jun 12, 2012 at 11:31 AM, Noah Misch n...@leadboat.com wrote:
 This seems bizarre and largely unnecessary.  As you stated to begin
 with, granting ownership of a function implies some degree of trust.

 Yes, but I would never expect that level of trust to include access to crash
 the server as a consequence of the function's reliance on STRICT.

 +1.  Crashes are bad.

C functions, by definition, carry a risk of crashing the server.
I cannot fathom the reasoning why we should consider that granting
ownership of one to an untrustworthy user is ever a good idea, let alone
something we promise to protect you from any bad consequences of.

Even if I accepted that premise, this patch is a pretty bad
implementation of it, because it restricts cases that there is no
reason to think are unsafe.

A less bizarre and considerably more future-proof restriction, IMO,
would simply refuse any attempt to give ownership of a C function
to a non-superuser.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Possible error in psql or Postgres?

2012-06-12 Thread Dusan Misic

Is this normal Postgres / psql behavior?

griffindb=# \d system.user;
  Table system.user
  Column   | Type  | Modifiers

---+---+

 username  | character varying(20) | not null
 password  | character varying(32) | not null
 firstname | character varying(40) | not null default 'nema ime'::character
vary
ing
 lastname  | character varying(40) | not null default 'nema
prezime'::character
varying
Indexes:
SystemUser_PK PRIMARY KEY, btree (username) CLUSTER

normal query:

griffindb=# select * from system.user where username = 'root';
 username | password | firstname |   lastname
--+--+---+---
 root | 1e7db545fccbf4e03abc6b71d329ab4f | Super | administrator
(1 row)

error query:

griffindb=# select * from system.user where user = 'root';
 username | password | firstname | lastname
--+--+---+--
(0 rows)

column user does not exist should throw an error!

PostgreSQL / psql version: 9.1.3 on Windows 7 64-bit

Should Postgres or psql report an error because column used in WHERE clause
does not exist?

Re: [HACKERS] Possible error in psql or Postgres?

2012-06-12 Thread Magnus Hagander

On Tue, Jun 12, 2012 at 9:21 PM, Dusan Misic promi...@gmail.com wrote:
 Is this normal Postgres / psql behavior?

 griffindb=# \d system.user;
   Table system.user
   Column   | Type  | Modifiers

 ---+---+
 
  username  | character varying(20) | not null
  password  | character varying(32) | not null
  firstname | character varying(40) | not null default 'nema ime'::character
 vary
 ing
  lastname  | character varying(40) | not null default 'nema
 prezime'::character
 varying
 Indexes:
     SystemUser_PK PRIMARY KEY, btree (username) CLUSTER

 normal query:

 griffindb=# select * from system.user where username = 'root';
  username | password | firstname |   lastname
 --+--+---+---
  root | 1e7db545fccbf4e03abc6b71d329ab4f | Super | administrator
 (1 row)

 error query:

 griffindb=# select * from system.user where user = 'root';
  username | password | firstname | lastname
 --+--+---+--
 (0 rows)

 column user does not exist should throw an error!

 PostgreSQL / psql version: 9.1.3 on Windows 7 64-bit

 Should Postgres or psql report an error because column used in WHERE clause
 does not exist?

User is not a column, it's a variable:
postgres=# select user;
 current_user
--
 mha
(1 row)


So it's comparing the constant to whatever your are logged in as. If
you log in as root in the database, it will return all rows in the
user table.

So in short, yes,  it's normal.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Possible error in psql or Postgres?

2012-06-12 Thread David Johnston

On Jun 12, 2012, at 15:21, Dusan Misic promi...@gmail.com wrote:

 Is this normal Postgres / psql behavior?
 
 griffindb=# \d system.user;
   Table system.user
   Column   | Type  | Modifiers
 
 ---+---+
 
  username  | character varying(20) | not null
  password  | character varying(32) | not null
  firstname | character varying(40) | not null default 'nema ime'::character 
 vary
 ing
  lastname  | character varying(40) | not null default 'nema 
 prezime'::character
 varying
 Indexes:
 SystemUser_PK PRIMARY KEY, btree (username) CLUSTER
 
 normal query: 
 
 griffindb=# select * from system.user where username = 'root';
  username | password | firstname |   lastname
 --+--+---+---
  root | 1e7db545fccbf4e03abc6b71d329ab4f | Super | administrator
 (1 row)
 
 error query:
 
 griffindb=# select * from system.user where user = 'root';
  username | password | firstname | lastname
 --+--+---+--
 (0 rows)
 
 column user does not exist should throw an error!
 
 PostgreSQL / psql version: 9.1.3 on Windows 7 64-bit
 
 Should Postgres or psql report an error because column used in WHERE clause 
 does not exist? 

http://www.postgresql.org/docs/9.0/interactive/functions-info.html

user is actually a function the returns the current_user.  It is an SQL 
special function and thus does not require the use of () after the function 
name.  So basically you are saying where current_user = 'root' which is 
either a constant true or false for the statement.

David J.

Re: [HACKERS] Possible error in psql or Postgres?

2012-06-12 Thread Dusan Misic

On 12.6.2012 21:33, David Johnston wrote:

On Jun 12, 2012, at 15:21, Dusan Misic promi...@gmail.com wrote:

Is this normal Postgres / psql behavior?

griffindb=# \d system.user;
  Table system.user
  Column   | Type  | Modifiers

---+---+

 username  | character varying(20) | not null
 password  | character varying(32) | not null
 firstname | character varying(40) | not null default 'nema ime'::character
vary
ing
 lastname  | character varying(40) | not null default 'nema
prezime'::character
varying
Indexes:
SystemUser_PK PRIMARY KEY, btree (username) CLUSTER

normal query:

griffindb=# select * from system.user where username = 'root';
 username | password | firstname |   lastname
--+--+---+---
 root | 1e7db545fccbf4e03abc6b71d329ab4f | Super | administrator
(1 row)

error query:

griffindb=# select * from system.user where user = 'root';
 username | password | firstname | lastname
--+--+---+--
(0 rows)

column user does not exist should throw an error!

PostgreSQL / psql version: 9.1.3 on Windows 7 64-bit

Should Postgres or psql report an error because column used in WHERE clause
does not exist?


http://www.postgresql.org/docs/9.0/interactive/functions-info.html

user is actually a function the returns the current_user.  It is an SQL
special function and thus does not require the use of () after the function
name.  So basically you are saying where current_user = 'root' which is
either a constant true or false for the statement.

David J.

Found the problem. Work USER is a reserved word. It is written in
PostgreSQL documentation. Sorry for false alarm. My bad.

Re: [HACKERS] Restrict ALTER FUNCTION CALLED ON NULL INPUT (was Re: Not quite a security hole: CREATE LANGUAGE for non-superusers)

2012-06-12 Thread Kevin Grittner

Tom Lane t...@sss.pgh.pa.us wrote:
 
 A less bizarre and considerably more future-proof restriction,
 IMO, would simply refuse any attempt to give ownership of a C
 function to a non-superuser.
 
We have C replication trigger functions where this would be a bad
thing.  They can't work properly as SECURITY INVOKER, and I see it
as a big step backwards in security to make the only other option
SECURITY DEFINER with a superuser as the owner.  It's not too hard
to come up with other use cases where you want to grant one class of
users rights to do something only through a certain function, not
directly.
 
So there is clearly a need to support ownership of functions,
including C functions, by users who are effectively at an
intermediate level of trust.  We could conceivably use the
database owner for that role, but that seem unnecessarily limiting.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Restrict ALTER FUNCTION CALLED ON NULL INPUT (was Re: Not quite a security hole: CREATE LANGUAGE for non-superusers)

2012-06-12 Thread Tom Lane

Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 A less bizarre and considerably more future-proof restriction,
 IMO, would simply refuse any attempt to give ownership of a C
 function to a non-superuser.
 
 We have C replication trigger functions where this would be a bad
 thing.  They can't work properly as SECURITY INVOKER, and I see it
 as a big step backwards in security to make the only other option
 SECURITY DEFINER with a superuser as the owner.

Could you provide more details about that?  If nothing else, this
could be handled with a non-C wrapper function, but I'm not clear
on the generality of the use-case.

 It's not too hard
 to come up with other use cases where you want to grant one class of
 users rights to do something only through a certain function, not
 directly.

Generally I'd imagine that that has something to do with permission
to call the function, not with who owns it.

Basically, if we go down the road Noah is proposing, I foresee a steady
stream of security bugs and ensuing random restrictions on what function
owners can do.  I do not like that future.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Mark JSON error detail messages for translation.

2012-06-12 Thread Tom Lane

Robert Haas robertmh...@gmail.com writes:
 On Tue, Jun 12, 2012 at 2:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm not thrilled with the line %d: prefixes.  That seems to me to
 violate the letter and spirit of the guideline about making errdetail
 messages be complete sentences.  Furthermore, the line number is not
 the most important part of the detail message, if indeed it has any
 usefulness at all which is debatable.  (It's certainly not going to
 be tremendously useful when the error report doesn't show any of the
 specific input string being complained of.)

 I am amenable to rephrasing the errdetail to put the line number
 elsewhere in the string, but I am strongly opposed to getting rid of
 it.  JSON blobs can easily run to dozens or hundreds of lines, and you
 will want to know the line number.

Agreed, the input strings could be quite large, but in that case we
definitely don't want to be including the whole input in the primary
error message (which is supposed to be short).  I doubt it'd even be
a good idea to try to put it into errdetail; thus I'm thinking about
providing one line only.

 Knowing the contents of the line
 will in many cases be LESS useful, because the line might contain,
 say, a single closing bracket.

True, but I don't think the line number alone is adequate.  There are
too many situations where it's not entirely clear what value is being
complained of.  (Considering only syntax errors thrown from literal
constants in SQL commands is quite misleading about the requirements
here.)  So I think we need to include at least some of the input in
the error.

 I notice that there's an unfinished attempt to maintain a line_start
 pointer; if that were carried through, we could imagine printing the
 current line up to the point of an error, which might provide a
 reasonable balance between verbosity and insufficient context.
 ...
 or perhaps better let it run to the end of the line:

 I'm not sure I find that an improvement, but I'm open to what other
 people think.

Anybody here besides the crickets?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Restrict ALTER FUNCTION CALLED ON NULL INPUT (was Re: Not quite a security hole: CREATE LANGUAGE for non-superusers)

2012-06-12 Thread Kevin Grittner

Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 
 We have C replication trigger functions where this would be a bad
 thing.  They can't work properly as SECURITY INVOKER, and I see
 it as a big step backwards in security to make the only other
 option SECURITY DEFINER with a superuser as the owner.
 
 Could you provide more details about that?
 
We have a capture_replication_data() trigger function that we attach
to each table which is to be replicated as the first AFTER EACH ROW
trigger for INSERT OR UPDATE OR DELETE.  It records the data
involved in the primitive operation against the row for logical
replication at the row level.  We don't want users to be able to
modify or even view the captured data in the replication tables
except through this function.  (It's actually a bit more complicated
than that because of transaction metadata, but the overall concept
is the same.)
 
We currently use the database owner for the owner of these SECURITY
DEFINER functions, but it seems to me that there could be legitimate
reasons to create a special user with more limited rights than the
database owner in some cases -- just to ensure that a mistake in the
coding of a function doesn't open up an unnecessarily large security
hole.
 
 If nothing else, this could be handled with a non-C wrapper
 function, but I'm not clear on the generality of the use-case.
 
I'm not so sure that this would work for a generalized trigger
function that can be attached to any table like this.
 
 It's not too hard to come up with other use cases where you want
 to grant one class of users rights to do something only through a
 certain function, not directly.
 
 Generally I'd imagine that that has something to do with
 permission to call the function, not with who owns it.
 
Well, it's a matter of fail-safe techniques.  You grant execute
permission for the function to a the role(s) which should be allowed
to do it only through the function.  But do you then necessarily
want the function to execute with unlimited rights, or with the most
restricted set of rights which allows it to perform the intended
function?
 
 Basically, if we go down the road Noah is proposing, I foresee a
 steady stream of security bugs and ensuing random restrictions on
 what function owners can do.  I do not like that future.
 
I do see your point, but I don't like the solution you proposed.
 
As I understand it, the problem Noah is trying to address is that if
we created a replication_manager role for owning these functions,
instead of using the database owner, that role could alter a C
function which isn't coded to handle NULL input to allow it to be
called on NULL input anyway.  Is that right?
 
The first solution which comes to mind for me is to allow a C
function to be compiled with that limitation -- so that *nobody*
could set the wrong option for it.  Then I realize that you already
can test for this in a C function and return NULL if any inputs are
NULL if you want to.  Rather than trying to enforce this in the
ALTER FUNCTION implementation, maybe we should just advise that if
you're going to grant ownership of a C function to a role which
might accidentally or maliciously allow it to be called with NULL
input, the C function should return NULL in that case.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Restrict ALTER FUNCTION CALLED ON NULL INPUT (was Re: Not quite a security hole: CREATE LANGUAGE for non-superusers)

2012-06-12 Thread Stephen Frost

* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
  It's not too hard
  to come up with other use cases where you want to grant one class of
  users rights to do something only through a certain function, not
  directly.
 
 Generally I'd imagine that that has something to do with permission
 to call the function, not with who owns it.

What I believe Kevin is getting at here is this:

There's no way to say run this function as user X except by making it
SECURITY DEFINER and owned by the user you want the function to run as.

I believe everyone agrees that only a superuser should be allowed
CHANGE a C-language function.  Unfortunately, being the 'OWNER' conveys
more than just the ability to change the function.

If we had an independent way to have the function run as a specific
user, where that user DIDN'T own the function, I think Kevin's use case
would be satisfied.

I'm not sure if it's be reasonable for a C-language function to just go
ahead and decide to change the user it's running as in the database..
I have to admit that I don't think I've ever tried to do that.

 Basically, if we go down the road Noah is proposing, I foresee a steady
 stream of security bugs and ensuing random restrictions on what function
 owners can do.  I do not like that future.

I agree that we don't want to have to police what a function owner can
do to a function, and therefore untrusted language functions should only
be owned by superusers, but I feel that means we need to look at what
function ownership currently implies and allows and consider if those
operations should be broken out and made independently grantable.  When
it comes to 'SECURITY DEFINER' and it's relationship to 'OWNER', I think
we have to provide some kind of solution that doesn't require those
functions to be run as superuser simply because the function has to be
owned by a superuser.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] Restrict ALTER FUNCTION CALLED ON NULL INPUT (was Re: Not quite a security hole: CREATE LANGUAGE for non-superusers)

2012-06-12 Thread Tom Lane

Stephen Frost sfr...@snowman.net writes:
 What I believe Kevin is getting at here is this:

 There's no way to say run this function as user X except by making it
 SECURITY DEFINER and owned by the user you want the function to run as.

 If we had an independent way to have the function run as a specific
 user, where that user DIDN'T own the function, I think Kevin's use case
 would be satisfied.

Interesting thought.  I'm not exactly sure who should be allowed to
apply the RUN AS other-user option to a function, but I can see the
possible value of separating the right to modify the function's
definition from the user the function runs as.  Kevin, does this seem
like it would address your concern?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Restrict ALTER FUNCTION CALLED ON NULL INPUT (was Re: Not quite a security hole: CREATE LANGUAGE for non-superusers)

2012-06-12 Thread Kevin Grittner

Stephen Frost sfr...@snowman.net wrote: 
 
 If we had an independent way to have the function run as a
 specific user, where that user DIDN'T own the function, I think
 Kevin's use case would be satisfied.
 
I agree.  I'm not sure quite what that would look like, but maybe
SECURITY ROLE rolename or some such could be an alternative to
SECURITY INVOKER and SECURITY DEFINER.  (I haven't looked to see
what the standard has here.)
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Mark JSON error detail messages for translation.

2012-06-12 Thread Alvaro Herrera


Excerpts from Tom Lane's message of mar jun 12 16:52:20 -0400 2012:
 Robert Haas robertmh...@gmail.com writes:

  I notice that there's an unfinished attempt to maintain a line_start
  pointer; if that were carried through, we could imagine printing the
  current line up to the point of an error, which might provide a
  reasonable balance between verbosity and insufficient context.
  ...
  or perhaps better let it run to the end of the line:
 
  I'm not sure I find that an improvement, but I'm open to what other
  people think.
 
 Anybody here besides the crickets?

I think providing both partial line contents (so +1 for maintaining
line_start carefully as required) and line number would be useful enough
to track down problems.

I am not sure about the idea of letting the detail run to the end of the
line; that would be problematic should the line be long (there might not
be newlines in the literal at all, which is not that unusual).  I think
it should be truncated at, say, 76 chars or so.

For the case where you have a single } in a line, this isn't all that
helpful; we could consider printing the previous line as well.  But if
you end up with

   }
}

then it's not that helpful either.  I am not sure.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Restrict ALTER FUNCTION CALLED ON NULL INPUT (was Re: Not quite a security hole: CREATE LANGUAGE for non-superusers)

2012-06-12 Thread Tom Lane

Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 Could you provide more details about that?
 
 We have a capture_replication_data() trigger function that we attach
 to each table which is to be replicated as the first AFTER EACH ROW
 trigger for INSERT OR UPDATE OR DELETE.  It records the data
 involved in the primitive operation against the row for logical
 replication at the row level.  We don't want users to be able to
 modify or even view the captured data in the replication tables
 except through this function.  (It's actually a bit more complicated
 than that because of transaction metadata, but the overall concept
 is the same.)
 
 We currently use the database owner for the owner of these SECURITY
 DEFINER functions, but it seems to me that there could be legitimate
 reasons to create a special user with more limited rights than the
 database owner in some cases -- just to ensure that a mistake in the
 coding of a function doesn't open up an unnecessarily large security
 hole.

I'm not entirely following here.  If the function is coded in C, then
it can pretty much do what it pleases independently of what user ID
is thought to be executing it at the SQL level.  That would really only
matter if you were doing some SQL stuff via the SPI interface --- and
if that's the case, couldn't the C function set the appropriate role
to use for itself, anyway?  (In other words, it's not that hard to build
a RUN AS other-user feature into a C function, even without any support
from the rest of the system.)

 Rather than trying to enforce this in the
 ALTER FUNCTION implementation, maybe we should just advise that if
 you're going to grant ownership of a C function to a role which
 might accidentally or maliciously allow it to be called with NULL
 input, the C function should return NULL in that case.

Yeah, the just-code-defensively option is worth considering too.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Restrict ALTER FUNCTION CALLED ON NULL INPUT (was Re: Not quite a security hole: CREATE LANGUAGE for non-superusers)

2012-06-12 Thread Alvaro Herrera


Excerpts from Kevin Grittner's message of mar jun 12 17:08:09 -0400 2012:
 Stephen Frost sfr...@snowman.net wrote: 
  
  If we had an independent way to have the function run as a
  specific user, where that user DIDN'T own the function, I think
  Kevin's use case would be satisfied.
  
 I agree.  I'm not sure quite what that would look like, but maybe
 SECURITY ROLE rolename or some such could be an alternative to
 SECURITY INVOKER and SECURITY DEFINER.  (I haven't looked to see
 what the standard has here.)

We talked briefly about this a year ago:
http://wiki.postgresql.org/wiki/PgCon_2011_Developer_Meeting#Authorization_Issues
(Not quite the same thing, but it's closely related).

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Restrict ALTER FUNCTION CALLED ON NULL INPUT (was Re: Not quite a security hole: CREATE LANGUAGE for non-superusers)

2012-06-12 Thread Stephen Frost

Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
 I'm not exactly sure who should be allowed to
 apply the RUN AS other-user option to a function, but I can see the
 possible value of separating the right to modify the function's
 definition from the user the function runs as.

When it comes to 'who can set it'- my first reaction is the owner.
The next question is- what rights does the owner have to have on the
other-user role, and I would suggest membership.  This could be
extremely useful for non-C functions as well, consider this:

I'm Bob.  I have an 'audit' role which is granted to me.  I'd like to
create a function that runs as 'audit' (which has various rights granted
to it which are less than the rights of 'Bob'), but which only I can
modify.  If I've been granted the 'audit' role, then I can create a
function which is owned by 'audit' (set role audit; create function
...), and I could make it security definer, therefore I should be able
to create a function which is owned by me and runs as 'audit'.

Writing this a bit off-the-cuff, so apologies if there are obvious flaws
in this logic. :)

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] Restrict ALTER FUNCTION CALLED ON NULL INPUT (was Re: Not quite a security hole: CREATE LANGUAGE for non-superusers)

2012-06-12 Thread Stephen Frost

* Tom Lane (t...@sss.pgh.pa.us) wrote:
 (In other words, it's not that hard to build
 a RUN AS other-user feature into a C function, even without any support
 from the rest of the system.)

I was considering this and a bit concerned about what would happen if
the C function actually did this and if we'd clean things up properly at
the end or if the function would be required to handle that clean-up
(if it was written as SECUURITY INVOKER, which is what's being suggested
here)...

In general, I'd certainly rather have the database handle that cleanly
and consistently than expect my function to clean up after itself.

Alvaro's point about the discussion of a stack of roles is certainly
something else to consider, though I feel that the 'run-as' option is
pretty straight-forward and could be done more-or-less identically to
how we do secuirty definer now, it's just changing where we get the role
to change to before running the function.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] [COMMITTERS] pgsql: Mark JSON error detail messages for translation.

2012-06-12 Thread Tom Lane

Alvaro Herrera alvhe...@commandprompt.com writes:
 I am not sure about the idea of letting the detail run to the end of the
 line; that would be problematic should the line be long (there might not
 be newlines in the literal at all, which is not that unusual).  I think
 it should be truncated at, say, 76 chars or so.

Yeah, I was wondering about trying to provide a given amount of context
instead of fixing it to one line.  We could do something like

(1) back up N characters;
(2) find the next newline, if there is one at least M characters before
the error point;
(3) print from there to the error point.

This would give between M and N characters of context, except when the
error point is less than M characters from the start of the input.  Not
sure how to display such text together with a line number though; with a
multi-line fragment it would not be obvious which part the line number
refers to.  (Backing up over multibyte data might be a bit complicated
too, but I assume we can think of something workable for that.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Restrict ALTER FUNCTION CALLED ON NULL INPUT (was Re: Not quite a security hole: CREATE LANGUAGE for non-superusers)

2012-06-12 Thread Kevin Grittner

Tom Lane t...@sss.pgh.pa.us wrote:
 
 I'm not entirely following here.  If the function is coded in C,
 then it can pretty much do what it pleases independently of what
 user ID is thought to be executing it at the SQL level.  That
 would really only matter if you were doing some SQL stuff via the
 SPI interface
 
Yeah, there is a lot of SPI through cached prepared statements.
 
 if that's the case, couldn't the C function set the appropriate
 role to use for itself, anyway?
 
I suppose it could, though I never really thought about that aspect
of the issue before.
 
 (In other words, it's not that hard to build a RUN AS other-user
 feature into a C function, even without any support from the rest
 of the system.)
 
Similar to what Stephen says in his post that came through while I
was typing this, I would be less comfortable with this being a
hand-rolled feature of individual C functions than having some more
systematic way to handle it.  Even if there is the possibility that
someone could subvert the more systematic way of doing things with
clever C code, I think the systematic approach reduces risk from
error and would tend to make hostile code in C functions stand out
more.  And having the information at the catalog level would make
the intended usages easier to manage.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Restrict ALTER FUNCTION CALLED ON NULL INPUT (was Re: Not quite a security hole: CREATE LANGUAGE for non-superusers)

2012-06-12 Thread Tom Lane

Stephen Frost sfr...@snowman.net writes:
 * Tom Lane (t...@sss.pgh.pa.us) wrote:
 (In other words, it's not that hard to build
 a RUN AS other-user feature into a C function, even without any support
 from the rest of the system.)

 I was considering this and a bit concerned about what would happen if
 the C function actually did this and if we'd clean things up properly at
 the end or if the function would be required to handle that clean-up
 (if it was written as SECUURITY INVOKER, which is what's being suggested
 here)...

It would have to remember to restore the previous role on normal exit,
but I believe that the system would take care of cleanup if an error is
thrown.  Looking at fmgr_security_definer, there are just a couple of
lines involved with changing the active role.  (There's a boatload of
*other* crap that's been shoved into that function over time, but the
part of it that actually does what it's named for is pretty darn small.)

 Alvaro's point about the discussion of a stack of roles is certainly
 something else to consider, though I feel that the 'run-as' option is
 pretty straight-forward and could be done more-or-less identically to
 how we do secuirty definer now, it's just changing where we get the role
 to change to before running the function.

Yes, it would surely not be much more code, just a bit added here:

if (procedureStruct-prosecdef)
fcache-userid = procedureStruct-proowner;

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Ability to listen on two unix sockets

2012-06-12 Thread Bruce Momjian

On Wed, Jun 06, 2012 at 10:38:42AM -0400, Tom Lane wrote:
 Florian Pflug f...@phlo.org writes:
  Couldn't you simply tell postgres to put it's socket in, say, /var/run, and 
  create a symlink to that socket in the global /tmp directory?
 
 FYI, this proposal emerged out of a discussion between Honza and
 myself.  Use a symlink was my first idea too, but on reflection
 it seems like it will take less new code to support two sockets.
 We already support multiple TCP sockets, so multiple Unix sockets
  ---
 shouldn't be that much extra trouble.

We do?  I didn't think listening on multiple interfaces meant we
listened on multiple sockets.  Is there something else?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.3: load path to mitigate load penalty for checksums

2012-06-12 Thread Jeff Davis

On Tue, 2012-06-12 at 08:42 -0400, Robert Haas wrote:
 Instead of trying to maintain MVCC semantics, maybe we should just
 have something like COPY (FROZEN) that just writes frozen tuples into
 the table and throws MVCC out the window.  Seems like that would be a
 lot easier to implement and satisfy basically the same use cases.

The reason that doesn't work is because there's really no answer for
ABORTs except to delete the whole table, or ask the user to try to
figure out what made it or what didn't. (That I can see, anyway; you may
have a better idea.)

In practice, I think the user would need to use this only for new tables
and use only one loader so they would know what to do if there is an
abort. That makes this very similar to the proposal for optimizing loads
by the transaction that creates the table; except with fewer safeguards.
It's basically just saying don't abort, and be careful with concurrent
reads (which are now dirty).

I think this problem largely has to do with freezing though, because
that's what makes it impossible to differentiate after an abort. Let's
set that aside for now, and just focus on setting HEAP_XMIN_COMMITTED,
PD_ALL_VISIBLE, and the VM bit. Those account for a major part of the
problem; being able to freeze also is really a bonus (and if the table
is really that static, then maybe creating it in the loading transaction
is reasonable).

In those terms, we can reframe the questions as: what do we do about
reads during the load?

Answers could include:
  (a) nothing -- reads during a load are potentially dirty
  (b) readers ignore hint bits during the load, and pay the penalty
  (c) allow only INSERT/COPY, and block unsafe SELECT/UPDATE/DELETE

(a) is your proposal except without the freezing. (b) was my proposal.
(c) was a backup I had in mind (if reads are expensive anyway, maybe
people would rather just block).

These aren't exclusive. Maybe we can do (c) in READ COMMITTED and above,
and (a) in READ UNCOMMITTED.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.2 final

2012-06-12 Thread Kevin Grittner

Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 I'm working on getting all of our triggers to behave with Tom's v8
 patch for bug 6123 and hope to be able to post a positive result
 tomorrow.  I think this is considered a bug and still subject to
 inclusion, but it doesn't really cause my shop any pain if it is
 bumped to 9.3.  In other words, I don't think this is a blocker.
 
Testing has run into problems, the cause of which is not immediately
obvious.  I think we should bump this to 9.3.  Our shop has a
workaround which isn't drawing any complaints here, and the issue
has been around forever in its current form.  I'm not even sure we
won't need more discussion on what constitutes correct behavior once
I track things down.
 
Any objections?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Ability to listen on two unix sockets

2012-06-12 Thread Tom Lane

Bruce Momjian br...@momjian.us writes:
 On Wed, Jun 06, 2012 at 10:38:42AM -0400, Tom Lane wrote:
 We already support multiple TCP sockets, so multiple Unix sockets
 shouldn't be that much extra trouble.

 We do?  I didn't think listening on multiple interfaces meant we
 listened on multiple sockets.  Is there something else?

There's one socket for each entry in the listen_addresses list,
plus one for the Unix socket.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.3: load path to mitigate load penalty for checksums

2012-06-12 Thread Jeff Davis

On Tue, 2012-06-12 at 13:31 -0400, Tom Lane wrote:
 Yeah; the notion of adding cycles to the visibility-check code paths,
 which would add cost even for users who have no use at all for this
 feature, is close to being a deal-breaker for me.  I would rather see
 us designing this around the idea of what can we do without adding
 any new complexity in visibility checks?.

[waves hands wildly]

We could create a different HeapTupleSatisfiesMVCC routine that ignores
hint bits, and select that at the time the scan is started. That would
at least avoid affecting existing code paths.

 At least for MVCC cases (hence, user tables only), it seems like we
 could pre-set XMIN_COMMITTED hint bits if there were a way to be sure
 that the XID in question would be seen as still running in the followup
 snapshot check that every MVCC visibility test must make anyway.  The
 hard part of that seems to be post-crash behavior, but maybe it'd be
 acceptable to incur crash-recovery-time cost to run around and unset
 such bits?

Hmm... but that still leaves PD_ALL_VISIBLE, which will also cause a
rewrite of the data. Under OLTP, we can assume that PD_ALL_VISIBLE is
much more rare than HEAP_XMIN_COMMITTED; but for bulk loads, setting
HEAP_XMIN_COMMITTED doesn't help us much without PD_ALL_VISIBLE.

 This doesn't do anything for pre-freezing of course, but I basically
 don't see any way that we could pre-freeze without breaking MVCC
 semantics anyhow.  So there's still attraction in the idea of offering
 users the choice of not sustaining MVCC behavior in exchange for cheaper
 bulk loads.

That may be reasonable, but it's much more dangerous than just breaking
MVCC (which to me implies isolation rules) -- pre-freezing would break
atomicity if you have any aborts. And I can't think of very many cases
where losing atomicity is reasonable (although I'm sure there are some).

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.3: load path to mitigate load penalty for checksums

2012-06-12 Thread Bruce Momjian

On Tue, Jun 12, 2012 at 01:31:05PM -0400, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  I don't think it's going to solve the problem in general, but TBH I
  don't think Jeff's proposal is, either.  I mean, ignoring
  xmin-committed, xmax-committed, and all-visible bits is going to come
  with a pretty steep performance penalty all of its own.  I don't doubt
  that you can construct situations in which it's better than incurring
  the I/O associated with setting those bits after the fact, but I also
  don't doubt that the reverse is true.  Furthermore, any system that
  involves sometimes ignoring those things is going to add cost in
  extremely hot code paths even when the user doesn't elect to take
  advantage of the new feature.
 
 Yeah; the notion of adding cycles to the visibility-check code paths,
 which would add cost even for users who have no use at all for this
 feature, is close to being a deal-breaker for me.  I would rather see
 us designing this around the idea of what can we do without adding
 any new complexity in visibility checks?.
 
 At least for MVCC cases (hence, user tables only), it seems like we
 could pre-set XMIN_COMMITTED hint bits if there were a way to be sure
 that the XID in question would be seen as still running in the followup
 snapshot check that every MVCC visibility test must make anyway.  The
 hard part of that seems to be post-crash behavior, but maybe it'd be
 acceptable to incur crash-recovery-time cost to run around and unset
 such bits?

Well, truncating tables that were empty on the load would certainly be
a easy to do --- not sure if that helps us, though.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.3: load path to mitigate load penalty for checksums

2012-06-12 Thread Jeff Davis

On Tue, 2012-06-12 at 13:10 -0400, Robert Haas wrote:
 I don't think it's going to solve the problem in general, but TBH I
 don't think Jeff's proposal is, either.  I mean, ignoring
 xmin-committed, xmax-committed, and all-visible bits is going to come
 with a pretty steep performance penalty all of its own.

I think we'd certainly have to discourage users from launching lots of
full table scans during a data load. We could go so far as blocking
reads, as I said in my other response, and still (mostly) meet my
primary goals.

But allowing reads without hint bits might be useful for highly
selective index access, which is a substantial use case for the kind of
large tables we'd be bulk-loading.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Ability to listen on two unix sockets

2012-06-12 Thread Bruce Momjian

On Tue, Jun 12, 2012 at 05:48:58PM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  On Wed, Jun 06, 2012 at 10:38:42AM -0400, Tom Lane wrote:
  We already support multiple TCP sockets, so multiple Unix sockets
  shouldn't be that much extra trouble.
 
  We do?  I didn't think listening on multiple interfaces meant we
  listened on multiple sockets.  Is there something else?
 
 There's one socket for each entry in the listen_addresses list,
 plus one for the Unix socket.

Oh, how do we handle '*'?  We pass that to the kernel, I assume.  Shame
there is wildcard ability for unix domain sockets, which would use any
directory --- guess that wouldn't work out well.  ;-)

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.3: load path to mitigate load penalty for checksums

2012-06-12 Thread Tom Lane

Jeff Davis pg...@j-davis.com writes:
 In those terms, we can reframe the questions as: what do we do about
 reads during the load?

 Answers could include:
   (a) nothing -- reads during a load are potentially dirty
   (b) readers ignore hint bits during the load, and pay the penalty
   (c) allow only INSERT/COPY, and block unsafe SELECT/UPDATE/DELETE

Or (d) it's not a problem, since the inserting XID is still busy
according to the readers' snapshots.

The part I think is actually hard is how to clean up if the inserting
xact doesn't reach commit.  I think what we're basically looking at here
is pushing more cost into that path in order to avoid cost in successful
cases.  The first design that comes to mind is

(1) the inserting xact remembers which tables it's inserted pre-hinted
tuples into, and if it has to abort, it first seqscans those tables to
reset the hint bits;
(2) it also emits WAL entries that will cause crash recovery to perform
an identical scan, if WAL ends without finding a commit or abort record
for the inserting xact.

But there may be other better ways.  One problem with the above sketch
is that you can't checkpoint till the insertion is committed, since a
checkpoint would prevent crash recovery from seeing the warning WAL
records.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.3: load path to mitigate load penalty for checksums

2012-06-12 Thread Jeff Davis

On Tue, 2012-06-12 at 17:53 -0400, Bruce Momjian wrote:
 Well, truncating tables that were empty on the load would certainly be
 a easy to do --- not sure if that helps us, though.

This goes back to the single-loader-into-an-empty-table case, which I'd
like to get away from. I think we already have some reasonable proposals
on the table for optimizing that case.

Ultimately, that pushes us toward partitioning with the same granularity
as the data load. We can try to open that discussion up, but I think
it's a bad idea to tie the data load granularity to the partitioning
granularity, unless we have a way to coalesce them later without
rewriting. It's too bad filesystems don't allow us to just reallocate
some blocks from one file to another. [ OK, I guess I just opened this
discussion myself ;) ]

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.2 final

2012-06-12 Thread Tom Lane

Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 I'm working on getting all of our triggers to behave with Tom's v8
 patch for bug 6123 and hope to be able to post a positive result
 tomorrow.  I think this is considered a bug and still subject to
 inclusion, but it doesn't really cause my shop any pain if it is
 bumped to 9.3.  In other words, I don't think this is a blocker.
 
 Testing has run into problems, the cause of which is not immediately
 obvious.  I think we should bump this to 9.3.  Our shop has a
 workaround which isn't drawing any complaints here, and the issue
 has been around forever in its current form.  I'm not even sure we
 won't need more discussion on what constitutes correct behavior once
 I track things down.

Agreed.  Even if we were entirely happy with the design of the patch
(which, from the previous discussion, we weren't 100%) and your testing
gave it a clean bill of health, it's uncomfortable to be pushing such a
change into 9.2 post-beta --- it might invalidate other peoples'
application compatibility checking, which I'm sure people have started
doing using the betas.  Punting to 9.3 seems like the thing to do.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.3: load path to mitigate load penalty for checksums

2012-06-12 Thread Jeff Davis

On Tue, 2012-06-12 at 18:02 -0400, Tom Lane wrote:
 Or (d) it's not a problem, since the inserting XID is still busy
 according to the readers' snapshots.

How much of a savings did we get from PD_ALL_VISIBLE when it was added
into the page-at-a-time visibility check?

From 608195a3a3656145a7eec7a47d903bc684011d73:

In addition to the visibility map, there's a new PD_ALL_VISIBLE flag on
each heap page, also indicating that all tuples on the page are visible
to all transactions. It's important that this flag is kept up-to-date.
It is also used to skip visibility tests in sequential scans, which
gives a small performance gain on seqscans.

If small means that it's something we can give up, then focusing on
HEAP_XMIN_COMMITTED makes sense. But if we can't give it up, then we
need to take it into account in the proposal.

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] hint bit i/o reduction

2012-06-12 Thread Merlin Moncure

hackers,

A while back I made an effort implementing a backend local hint bit
cache with the intention of mitigating i/o impact for scan heavy
workloads that involve moving a lot of records around per transaction.
 The basic concept was to keep some backend private memory that
tracked the resolution of recently seen transactions and worked in a
similar fashion to the hint bits:

if (!commit_hint_bit_set  hint_bit_cache(xid) == committed)
{

The other interesting aspect was that, if the bit was found in the
cache, the bit was set on the tuple but the page was not dirtied.  If
the xid was not found in the cache, the bit was set and the page was
dirtied normally.  From a strictly performance standpoint, limited
testing seemed to indicate it more or less fixed the hint bit i/o
problems.  Still, injecting extra logic (especially a cache lookup)
into the visibility routines is not to be taken lightly, and maybe
there's a better way.  There simply has to be a way to ameliorate hint
bit i/o without spending a lot of cycles and hopefully not too much
impacting other workloads.  Unfortunately, assuming the CRC patch
makes it in, any of the branch of tricks in the line of 'set the bit
but avoid dirtying the page' aren't going to fly.  I think, at the end
of the day, we have to avoid setting the bit, but only in cases where
we are fairly sure we would prefer not to do that.

I'm thinking something fairly simple would get some good mileage:

1) Keep track # times the last transaction id was repeatedly seen in
tqual.c (resetting as soon as a new xid was touched.  We can do this
just for xmin, or separately for both xmin and xmax.
2) right after checking the hint bit (assuming it wasn't set), test to
see if our xid is the 'last xid', and iff #times_seen_same 
#some_number (say, 100 or 1000), use it as if the hint bit was set.
#some_number can be chosen to some fairly conservative defined value,
or perhaps we can be sneaky and try and adjust it based on things like
how many unforced clog faults we're doing -- maybe even keeping some
accounting at the page level.  We can also try to be smart and disable
the 'avoid setting the hint bit' logic if the page is already dirty.

The presumption here is that if you're seeing the same xid over and
over, there simply isn't a lot of value in recording it in page after
page after page.  It's tempting to suggest that there is already an
'is this xid the same as the last one AKA last_transaction_id' at the
clog visibility checking level, but that doesn't help for the
following reasons:

1) while it does avoid the jump out to clog, it occurs at the wrong
place in the sequence of visibility checking (after you've wasted
cycles in TransactionIdIsInProgress() etc) and
2) by being in the clog level can't influence visibility behaviors of
whether or not to set the bit.
3) it's not inline

I see two potential downsides here:
1) maybe the extra logic in visibility is too expensive (but i doubt it)
2) you're missing early opportunities to set the all visible bit which
in turn will influence index only scans.

The above might be a small price to pay for saving all the i/o and
sooner or later if the records are long lived vacuum will swing around
and tag them.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minimising windows installer password confusion

2012-06-12 Thread Craig Ringer


On 06/12/2012 08:48 PM, Dave Page wrote:
I'm not keen on adding additional user accounts - that's a security 
problem imho.
It's also an issue for add-ons like PgAgent that aren't necessarily tied 
to one exact version of Pg.

That makes sense.  I just think we should try very hard to make the
installer just work to the extent possible, rather than trying to
direct the user in how to use system tools in the middle of the
process.

Right - that's what always aim to do (and in fact was the number one
driver behind the current generation of installers), and provided the
user remembers their password it works just fine.
Users don't remember passwords, though. It's one of those constants, and 
is why practically every web site etc out there offers password recovery.


The installer IMO needs to store the postgres account password in a 
registry key with permissions set so that only users with local admin 
rights (ie: who can use the installer) can view it. I don't like the 
idea of storing a password, but it's only going to be accessible if you 
already have rights to the registry as local admin, in which case the 
attacker can just reset it themselves (or root your machine). So long as 
they installer warns that the password shouldn't be one you use 
elsewhere because it can be recovered from your computer, I don't see a 
problem.---


--
Craig Ringer




POST Newspapers
276 Onslow Rd, Shenton Park
Ph: 08 9381 3088 Fax: 08 9388 2258
ABN: 50 008 917 717
http://www.postnewspapers.com.au/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minimising windows installer password confusion

2012-06-12 Thread Craig Ringer


On 06/12/2012 08:08 PM, Dave Page wrote:

Some background: By default the installer will use 'postgres' for both
the service (OS) account, and the database superuser account. It will
use the same password for both (though, users have complete control at
the command line if they want it, which is why the account names are
shown in the installer). Unlike *nix installations, the service
account must have a password, which is required during both
installation and upgrade to configure the PostgreSQL service. We
therefore ask for the password during both installation and upgrade.
If the user account exists (which can happen if there has previously
been an installation of Postgres on the system), the user must specify
the existing password for the account during installation (and of
course, during all upgrades). This is where users normally get stuck,
as they've forgotten the password they set in the past.
They may not even have set it, because the EnterpriseDB installer can be 
run unattended and may've been used by some 3rd party package.


I'd be interested in seeing what Microsoft installers that create 
isolated user accounts do. I think .NET creates one for ASP, so that'd 
be one option to look into.


--
Craig Ringer

POST Newspapers
276 Onslow Rd, Shenton Park
Ph: 08 9381 3088 Fax: 08 9388 2258
ABN: 50 008 917 717
http://www.postnewspapers.com.au/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Mark JSON error detail messages for translation.

2012-06-12 Thread Tom Lane

I wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
 I am not sure about the idea of letting the detail run to the end of the
 line; that would be problematic should the line be long (there might not
 be newlines in the literal at all, which is not that unusual).  I think
 it should be truncated at, say, 76 chars or so.

 Yeah, I was wondering about trying to provide a given amount of context
 instead of fixing it to one line.  We could do something like
 (1) back up N characters;
 (2) find the next newline, if there is one at least M characters before
 the error point;
 (3) print from there to the error point.

After experimenting with this for awhile I concluded that the above is
overcomplicated, and that we might as well just print up to N characters
of context; in most input, the line breaks are far enough apart that
preferentially breaking at them just leads to not having very much
context.  Also, it seems like it might be a good idea to present the
input as a CONTEXT line, because that provides more space; you can fit
50 or so characters of data without overrunning standard display width.
This gives me output like

regression=# select 
'{unique1:8800,unique2:0,two:0,four:0,ten:0,twenty:0,hundred:0,thousand:800,
twothous
and:800,fivethous:3800,tenthous:8800,odd:0,even:1,
stringu1:MA,stringu2:AA,string4:xx}'
::json;
ERROR:  invalid input syntax for type json
LINE 1: select '{unique1:8800,unique2:0,two:0,four:0,ten:0...
   ^
DETAIL:  Character with value 0x0a must be escaped.
CONTEXT:  JSON data, line 1: ...twenty:0,hundred:0,thousand:800,
twothous

regression=# 

I can't give too many examples because I've only bothered to context-ify
this single error case as yet ;-) ... but does this seem like a sane way
to go?

The code for this is as attached.  Note that I'd rip out the normal-path
tracking of line boundaries; it seems better to have a second scan of
the data in the error case and save the cycles in non-error cases.


...
ereport(ERROR,
(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
 errmsg(invalid input syntax for type json),
 errdetail(Character with value \0x%02x\ must be 
escaped.,
   (unsigned char) *s),
 report_json_context(lex)));
...


/*
 * Report a CONTEXT line for bogus JSON input.
 *
 * The return value isn't meaningful, but we make it non-void so that this
 * can be invoked inside ereport().
 */
static int
report_json_context(JsonLexContext *lex)
{
char   *context_start;
char   *context_end;
char   *line_start;
int line_number;
char   *ctxt;
int ctxtlen;

/* Choose boundaries for the part of the input we will display */
context_start = lex-input;
context_end = lex-token_terminator;
line_start = context_start;
line_number = 1;
for (;;)
{
/* Always advance over a newline, unless it's the current token */
if (*context_start == '\n'  context_start  lex-token_start)
{
context_start++;
line_start = context_start;
line_number++;
continue;
}
/* Otherwise, done as soon as we are close enough to context_end */
if (context_end - context_start  50)
break;
/* Advance to next multibyte character */
if (IS_HIGHBIT_SET(*context_start))
context_start += pg_mblen(context_start);
else
context_start++;
}

/* Get a null-terminated copy of the data to present */
ctxtlen = context_end - context_start;
ctxt = palloc(ctxtlen + 1);
memcpy(ctxt, context_start, ctxtlen);
ctxt[ctxtlen] = '\0';

return errcontext(JSON data, line %d: %s%s,
  line_number,
  (context_start  line_start) ? ... : ,
  ctxt);
}


regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.3: load path to mitigate load penalty for checksums

2012-06-12 Thread Robert Haas

On Tue, Jun 12, 2012 at 6:02 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The part I think is actually hard is how to clean up if the inserting
 xact doesn't reach commit.  I think what we're basically looking at here
 is pushing more cost into that path in order to avoid cost in successful
 cases.  The first design that comes to mind is

 (1) the inserting xact remembers which tables it's inserted pre-hinted
 tuples into, and if it has to abort, it first seqscans those tables to
 reset the hint bits;

I don't think we can count on that to be safe in an arbitrarily chosen
abort path.  Anything FATAL, for instance.  I think we're going to
need to keep track of some kind table-xmin value, representing the
oldest operation on the table that's not cleaned up yet, and make it
autovacuum's job to clean any that precede OldestXmin.  If the backend
can clean itself up, great, but there has to be some kind of allowance
for the case where that doesn't happen.

I'm also skeptical about the notion that scan the whole table is
going to be a good idea.  It really will have to be a full sequential
scan, if we're setting visibility map bits as we go, not just a scan
of pages that are not-all-visible, as vacuum normally does.  I think
if we want to go this route, we need to log the TID of every tuple we
write into the heap into some kind of undo fork (or maybe just the
block numbers), so that if the transaction aborts, we (or autovacuum)
can go back and find all of those TIDs and mark the tuples dead
without having to scan through (potentially) terabytes of data.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.3: load path to mitigate load penalty for checksums

2012-06-12 Thread Robert Haas

On Tue, Jun 12, 2012 at 6:26 PM, Jeff Davis pg...@j-davis.com wrote:
 On Tue, 2012-06-12 at 18:02 -0400, Tom Lane wrote:
 Or (d) it's not a problem, since the inserting XID is still busy
 according to the readers' snapshots.

 How much of a savings did we get from PD_ALL_VISIBLE when it was added
 into the page-at-a-time visibility check?

 From 608195a3a3656145a7eec7a47d903bc684011d73:

 In addition to the visibility map, there's a new PD_ALL_VISIBLE flag on
 each heap page, also indicating that all tuples on the page are visible
 to all transactions. It's important that this flag is kept up-to-date.
 It is also used to skip visibility tests in sequential scans, which
 gives a small performance gain on seqscans.

 If small means that it's something we can give up, then focusing on
 HEAP_XMIN_COMMITTED makes sense. But if we can't give it up, then we
 need to take it into account in the proposal.

It's significant.

rhaas=# create table foo (a int, b text);
ERROR:  relation foo already exists
rhaas=# create table bar (a int, b text);
CREATE TABLE
rhaas=# insert into bar select g,
random()::text||random()::text||random()::text||random()::text from
generate_series(1,100) g;
INSERT 0 100
rhaas=# \timing
Timing is on.
rhaas=# select sum(1) from bar;   sum
-
 100
(1 row)

Time: 257.500 ms
rhaas=# select sum(1) from bar;
   sum
-
 100
(1 row)

Time: 140.763 ms
rhaas=# select sum(1) from bar;
   sum
-
 100
(1 row)

Time: 142.760 ms
rhaas=# select sum(1) from bar;
   sum
-
 100
(1 row)

Time: 140.603 ms
rhaas=# vacuum bar;
VACUUM
Time: 133.084 ms
rhaas=# select sum(1) from bar;
   sum
-
 100
(1 row)

Time: 123.591 ms
rhaas=# select sum(1) from bar;
   sum
-
 100
(1 row)

Time: 123.096 ms
rhaas=# select sum(1) from bar;
   sum
-
 100
(1 row)

Time: 122.653 ms

So vacuuming to set the PD_ALL_VISIBLE bits is buying us more than 10% here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Mark JSON error detail messages for translation.

2012-06-12 Thread Robert Haas

On Tue, Jun 12, 2012 at 9:52 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The code for this is as attached.  Note that I'd rip out the normal-path
 tracking of line boundaries; it seems better to have a second scan of
 the data in the error case and save the cycles in non-error cases.

Really?!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minimising windows installer password confusion

2012-06-12 Thread Craig Ringer


On 06/13/2012 01:19 AM, Sachin Srivastava wrote:


On Tue, Jun 12, 2012 at 7:43 PM, Dave Page dp...@pgadmin.org
mailto:dp...@pgadmin.org wrote:

On Tue, Jun 12, 2012 at 2:57 PM, Robert Haas robertmh...@gmail.com
mailto:robertmh...@gmail.com wrote:



  What we need is to display a different dialogue based on the
situation.
 
  If the account already exists, we should say Please enter the
  password for the existing postgres account.  If you do not know the
  password, you can reset it using the Windows control panel.


Why using the windows control panel ?

They're running an installer with the rights to create/alter/delete 
users. Shouldn't the installer just offer to reset the postgres 
password, after warning them that it'll break other versions of 
PostgreSQL and tools like PgAgent?


IMO, it'd be better for the installer to just take care of this behind 
the scenes. Generate a random password. Store it in the registry in a 
key that only the services manager ( SYSTEM account? ) and local 
administrators can read. Use it in subsequent installs. Make the 
postgres database password completely unrelated to it.


--
Craig Ringer

POST Newspapers
276 Onslow Rd, Shenton Park
Ph: 08 9381 3088 Fax: 08 9388 2258
ABN: 50 008 917 717
http://www.postnewspapers.com.au/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 16-bit page checksums for 9.2

2012-06-12 Thread Jeff Davis

On Sun, 2012-02-19 at 21:49 +, Simon Riggs wrote:
 On Thu, Feb 16, 2012 at 11:16 AM, Simon Riggs si...@2ndquadrant.com wrote:
 
  v8 attached
 
 v10 attached.
 
 This patch covers all the valid concerns discussed and has been
 extensively tested.
 
 I expect to commit this in about 24 hours, so last call for comments
 please or say if you need more time to review.

I have made an attempt to merge this with master. This patch is not
meant to be the next version, and I have not made a real effort to be
sure this was a safe merge. I am just hoping this saves Simon some of
the tedious portions of the merge, and offers a reference point to see
where his merge differs from mine.

In the meantime, I'm looking at the original v10 patch against master as
it existed when Simon posted the patch.

I'd like to see checksums take part in the June commitfest.

Regards,
Jeff Davis


checksums-v10-merge.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.3: load path to mitigate load penalty for checksums

2012-06-12 Thread Jeff Davis

On Tue, 2012-06-12 at 22:06 -0400, Robert Haas wrote:
  How much of a savings did we get from PD_ALL_VISIBLE when it was added
  into the page-at-a-time visibility check?
 
  From 608195a3a3656145a7eec7a47d903bc684011d73:
 
  In addition to the visibility map, there's a new PD_ALL_VISIBLE flag on
  each heap page, also indicating that all tuples on the page are visible
  to all transactions. It's important that this flag is kept up-to-date.
  It is also used to skip visibility tests in sequential scans, which
  gives a small performance gain on seqscans.
 
  If small means that it's something we can give up, then focusing on
  HEAP_XMIN_COMMITTED makes sense. But if we can't give it up, then we
  need to take it into account in the proposal.
 
 It's significant.

In that case, the proposals that only involve HEAP_XMIN_COMMITTED don't
seem viable to me. To get maxiumum read speed, we will need to set
PD_ALL_VISIBLE also, and that means rewriting the entire table anyway
(for the workload that I'm describing).

However, maybe if we combine the approaches, readers could ignore
PD_ALL_VISIBLE during the load, which looks like maybe a 10% penalty,
without having to ignore HEAP_XMIN_COMMITTED (which seems much more
costly to ignore).

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-12 Thread Bruce Momjian

On Wed, Jun 06, 2012 at 06:46:37PM -0400, Tom Lane wrote:
 I wrote:
  Actually, it looks like there is an extremely simple way to handle this,
  which is to move the call of LogStandbySnapshot (which generates the WAL
  record in question) to before the checkpoint's REDO pointer is set, but
  after we have decided that we need a checkpoint.
 
 On further contemplation, there is a downside to that idea, which
 probably explains why the code was written as it was: if we place the
 XLOG_RUNNING_XACTS WAL record emitted during a checkpoint before rather
 than after the checkpoint's REDO point, then a hot standby slave
 starting up from that checkpoint won't process the XLOG_RUNNING_XACTS
 record.  That means its KnownAssignedXids machinery won't be fully
 operational until the master starts another checkpoint, which might be
 awhile.  So this could result in undesirable delay in hot standby mode
 becoming active.

Stupid question, but why are we not just setting a boolean variable in
shared memory if we WAL-write a non-XLOG_RUNNING_XACTS record, and only
checkpoint if that is true?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] COMMENT on function's arguments

2012-06-12 Thread Vlad Arkhipov

Does it make sense to have a comment on function's arguments? Of course 
it is possible to include these comments in a function's comment, but 
may be better to have them in more formalized way like comments on 
columns of a table. IDEs may use this information when providing hints 
for a function like in other languages.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] XLog changes for 9.3

2012-06-12 Thread Bruce Momjian

On Thu, Jun 07, 2012 at 02:52:04PM -0400, Tom Lane wrote:
  So you can pretty much kiss goodbye to the idea of pg_upgrade.
 
 And that is certainly nonsense.  I don't think pg_upgrade even knows
 about this, and if it does we can surely fix it.

pg_upgrade doesn't know anything about xlog files --- all its interaction
in that area is through pg_resetxlog and it doesn't look at the xlog
details.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP patch for Todo Item : Provide fallback_application_name in contrib/pgbench, oid2name, and dblink

2012-06-12 Thread Amit Kapila


 Why not 'dblink'?

We can do for dblink as well. I just wanted to check before implementing in
dblink.

I have checked the dblink_connect() function, it receives the connect string
and used in most cases that string to
call libpq connect which is different from pgbench and oid2name where
connection parameters are formed in main function and then call libpq
connect.

To achieve the same in dblink, we need to parse the passed connection string
and check if it contains fallback_application_name, if yes then its okay,
otherwise we need to append fallback_application_name in connection string.

The doubt I have is that what name to use as fallback_application_name
because here we cannot have argv as in pgbench or oid2name?
The 2 options which I can think of are:
1. Hard-coded name - dblink
2. postgres - which I think will be caller of dblink functionality

Please suggest if any of this option is okay or what is other way to get the
program name.

-Original Message-
From: Robert Haas [mailto:robertmh...@gmail.com] 
Sent: Wednesday, June 13, 2012 12:13 AM
To: Amit Kapila
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] WIP patch for Todo Item : Provide
fallback_application_name in contrib/pgbench, oid2name, and dblink

On Mon, Jun 11, 2012 at 5:30 AM, Amit Kapila amit.kap...@huawei.com wrote:
 As per the previous discussion in link below, it seems that fallback
 application name needs to be provided for only

 pgbench and oid2name.


http://archives.postgresql.org/message-id/w2g9837222c1004070216u3bc46b3ahbdd
fdffdbfb46...@mail.gmail.com



 However the title of Todo Item suggests it needs to be done for dblink as
 well.

 Please suggest if it needs to be done for dblink, if yes then what should
be
 fallback_application_name for it?

Why not 'dblink'?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-12 Thread Tom Lane

Bruce Momjian br...@momjian.us writes:
 Stupid question, but why are we not just setting a boolean variable in
 shared memory if we WAL-write a non-XLOG_RUNNING_XACTS record, and only
 checkpoint if that is true?

Well, (1) we are trying to avoid adding such logic to the critical
section inside XLogInsert, and (2) XLOG_RUNNING_XACTS is not the only
problematic record type, there's at least one other.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] initdb and fsync

2012-06-12 Thread Jeff Davis

On Sun, 2012-03-25 at 19:59 -0700, Jeff Davis wrote:
 On Sat, 2012-03-17 at 17:48 +0100, Cédric Villemain wrote:
  I agree with Andres.
  
  
  I believe we should use sync_file_range (_before?) with linux.
  
  And we can use posix_fadvise_dontneed on other kernels.
  
 OK, updated patch attached. sync_file_range() is preferred,
 posix_fadvise() is a fallback.
 

Rebased patch attached. No other changes.

Regards,
Jeff Davis


initdb-fsync-20120612.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

94 matches

Mail list logo