Re: [HACKERS] Page Checksums

2011-12-18 Thread Heikki Linnakangas

On 17.12.2011 23:33, David Fetter wrote:

What:

 Please find attached a patch for 9.2-to-be which implements page
 checksums.  It changes the page format, so it's an initdb-forcing
 change.

How:
 In order to ensure that the checksum actually matches the hint
 bits, this makes a copy of the page, calculates the checksum, then
 sends the checksum and copy to the kernel, which handles sending
 it the rest of the way to persistent storage.
...
If this introduces new failure modes, please detail, and preferably
demonstrate, just what those new modes are.


Hint bits, torn pages -> failed CRC. See earlier discussion:

http://archives.postgresql.org/pgsql-hackers/2009-11/msg01975.php

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Autonomous subtransactions

2011-12-18 Thread Gianni Ciolli
Dear Hackers,

I have written some notes about autonomous subtransactions, which have
already been touched on (at least) in two separate threads; please find
them at

  http://wiki.postgresql.org/wiki/Autonomous_subtransactions

It is meant to be an ongoing project, requesting comments and
contributions, rather than a conclusive document.

Thanks,
Dr. Gianni Ciolli - 2ndQuadrant Italia
PostgreSQL Training, Services and Support
gianni.cio...@2ndquadrant.it | www.2ndquadrant.it

p.s. The reason for the variation on the name is that there is always
 a parent transaction.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Page Checksums

2011-12-18 Thread David Fetter
On Sun, Dec 18, 2011 at 10:14:38AM +0200, Heikki Linnakangas wrote:
 On 17.12.2011 23:33, David Fetter wrote:
 What:
 
  Please find attached a patch for 9.2-to-be which implements page
  checksums.  It changes the page format, so it's an initdb-forcing
  change.
 
 How:
  In order to ensure that the checksum actually matches the hint
  bits, this makes a copy of the page, calculates the checksum, then
  sends the checksum and copy to the kernel, which handles sending
  it the rest of the way to persistent storage.
 ...
 If this introduces new failure modes, please detail, and preferably
 demonstrate, just what those new modes are.
 
 Hint bits, torn pages -> failed CRC. See earlier discussion:
 
 http://archives.postgresql.org/pgsql-hackers/2009-11/msg01975.php

The patch requires that full page writes be on in order to obviate
this problem by never reading a torn page.  Instead, a copy of the page
has already hit storage before the torn write occurs.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON for PG 9.2

2011-12-18 Thread Jan Urbański
On 18/12/11 04:21, Robert Haas wrote:
 On Sat, Dec 17, 2011 at 7:50 PM, David E. Wheeler da...@kineticode.com 
 wrote:
 Love having the start here. I forwarded this message to Claes Jakobsson, 
 creator of the jansson-using pg-json extension. He’s a bit less supportive. 
 He gave me permission to quote him here:

 Frankly I see the inclusion of a JSON datatype in core as unnecessary. 
 Stuff should be moved out of core rather than in, as we do in Perl. Also, 
 does this patch mean that the 'json' type is forever claimed and can't be 
 replaced by extensions?

 There's little reason to reimplement JSON parsing, comparison and other 
 routines when there's a multitude of already good libraries.
 
 That's fair enough, but we've had *many* requests for this
 functionality in core; I don't see what we lose by having at least
 some basic functionality built in.

I think having a JSON data type in core would drastically limit the
exposure third-party JSON extensions would get, and that's bad. There are
tons of interesting features a JSON type could have, and tying their
development to a one-year release cycle might be a disservice both to
people who are willing to provide these features earlier and to users
faced with a choice between a fast-moving third-party addon and a
blessed core type; it would also cause overall confusion.

How about we try the tsearch way and let JSON extensions live outside
core for some time and perhaps if one emerges dominant and would benefit
from inclusion then consider it?

If we keep treating extensions as second-class citizens, they'll never
get the mindshare and importance we seem to want for them (or otherwise
why go through all the trouble to provide an infrastructure for them).

Cheers,
Jan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON for PG 9.2

2011-12-18 Thread Pavel Stehule
2011/12/18 Jan Urbański wulc...@wulczer.org:
 On 18/12/11 04:21, Robert Haas wrote:
 On Sat, Dec 17, 2011 at 7:50 PM, David E. Wheeler da...@kineticode.com 
 wrote:
 Love having the start here. I forwarded this message to Claes Jakobsson, 
 creator of the jansson-using pg-json extension. He’s a bit less supportive. 
 He gave me permission to quote him here:

 Frankly I see the inclusion of a JSON datatype in core as unnecessary. 
 Stuff should be moved out of core rather than in, as we do in Perl. Also, 
 does this patch mean that the 'json' type is forever claimed and can't be 
 replaced by extensions?

 There's little reason to reimplement JSON parsing, comparison and other 
 routines when there's a multitude of already good libraries.

 That's fair enough, but we've had *many* requests for this
 functionality in core; I don't see what we lose by having at least
 some basic functionality built in.

 I think having a JSON data type in core would drastically limit the
 exposure third-party JSON extensions would get, and that's bad. There are
 tons of interesting features a JSON type could have, and tying their
 development to a one-year release cycle might be a disservice both to
 people who are willing to provide these features earlier and to users
 faced with a choice between a fast-moving third-party addon and a
 blessed core type; it would also cause overall confusion.

 How about we try the tsearch way and let JSON extensions live outside
 core for some time and perhaps if one emerges dominant and would benefit
 from inclusion then consider it?

It should be a contrib module.

Pavel


 If we keep treating extensions as second-class citizens, they'll never
 get the mindshare and importance we seem to want for them (or otherwise
 why go through all the trouble to provide an infrastructure for them).

 Cheers,
 Jan

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Page Checksums

2011-12-18 Thread Heikki Linnakangas

On 18.12.2011 10:54, David Fetter wrote:

On Sun, Dec 18, 2011 at 10:14:38AM +0200, Heikki Linnakangas wrote:

On 17.12.2011 23:33, David Fetter wrote:

If this introduces new failure modes, please detail, and preferably
demonstrate, just what those new modes are.


Hint bits, torn pages -> failed CRC. See earlier discussion:

http://archives.postgresql.org/pgsql-hackers/2009-11/msg01975.php


The patch requires that full page writes be on in order to obviate
this problem by never reading a torn page.


Doesn't help. Hint bit updates are not WAL-logged.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON for PG 9.2

2011-12-18 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 On Sat, Dec 17, 2011 at 5:02 PM, Dimitri Fontaine
 dimi...@2ndquadrant.fr wrote:
 I'd like to add some confusion on the implementation choice, because it
 looks damn too easy now… Guile 2.0 offers an implementation of the
 ECMAscript language and plscheme already exists as a PostgreSQL PL
 extension for integrating with Guile.

 It seems like the licensing there could potentially be problematic.
 It's GPL with a linking exception.  Not sure we want to go there.

It's LGPL so it's compatible (only the readline part is subject to GPL,
we're familiar enough with that though).

  http://www.gnu.org/software/guile/docs/docs-2.0/guile-ref/Guile-License.html

  The Guile library (libguile) and supporting files are published under
  the terms of the GNU Lesser General Public License version 3 or later.
  See the files COPYING.LESSER and COPYING.

  C code linking to the Guile library is subject to terms of that
  library. Basically such code may be published on any terms, provided
  users can re-link against a new or modified version of Guile.

  Scheme level code written to be run by Guile (but not derived from
  Guile itself) is not restricted in any way, and may be published on
  any terms. We encourage authors to publish on Free terms.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers

2011-12-18 Thread Dimitri Fontaine
Bruce Momjian br...@momjian.us writes:
 We have a big user community and what _you_ want for this feature is
 only a small part of our decision on what is needed.  Robert's concern
 that this might not be useful enough for the general use-cases people
 want is a legitimate, if difficult to hear, analysis.

Agreed, his concern is legitimate.  Now, I've never tried to
implement a generic event trigger system and I don't know what it would
take to implement such a beast.

Transaction BEGIN, COMMIT, and ROLLBACK triggers, anyone?  Not me :)

Exploring whether my proposal would be a pain to maintain once we have a
fully generic event trigger system someday is legitimate; asking me to
design both the generic event system and the command triggers so that
they fit in is just asking for too much.

The main part of my answer, though, is that all the more complex use
cases involving command triggers that Robert is offering are in fact
possible to implement with what my patch is providing, as long as you're
ok with understanding the content and format of the nodeToString()
output.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to allow users to kill their own queries

2011-12-18 Thread Magnus Hagander
On Fri, Dec 16, 2011 at 13:31, Greg Smith g...@2ndquadrant.com wrote:
 On 12/14/2011 05:24 AM, Magnus Hagander wrote:

 How about passing a parameter to pg_signal_backend? Making
 pg_signal_backend(int pid, int sig, bool allow_samerole)?



 That works, got rid of the parts I didn't like and allowed some useful minor
 restructuring.  I also made the HINT better and match style guidelines:

 gsmith=> select pg_terminate_backend(21205);

 ERROR:  must be superuser to terminate other server processes
 HINT:  You can cancel your own processes with pg_cancel_backend().
 gsmith=> select pg_cancel_backend(21205);
  pg_cancel_backend
 -------------------
  t

 New rev attached and pushed to
 https://github.com/greg2ndQuadrant/postgres/tree/cancel-backend (which is
 *not* the same branch as I used last time; don't ask why, long story)

 I considered some additional ways to restructure the checks that could
 remove a further line or two from the logic here, but they all made the
 result seem less readable to me.  And this is not a high performance code
 path.  I may have gone a bit too far with the comment additions though, so
 feel free to trim that back.  It kept feeling weird to me that none of the
 individual signaling functions had their own intro comments.  I added all
 those.

 I also wrote up a commentary on the PID wraparound race condition
 possibility Josh brought up.  Some research shows that pid assignment on
 some systems is made more secure by assigning new ones randomly.  That seems
 like it would make it possible to have a pid get reused much faster than on
 the usual sort of system that does sequential assignment and wraparound.  A
 reuse collision still seems extremely unlikely though.  With the new
 comments, at least a future someone who speculates on this will know how
 much thinking went into the current implementation:  enough to notice, not
 enough to see anything worth doing about it.  Maybe that's just wasted lines
 of text?

 With so little grief on the last round, I'm going to guess this one will
 just get picked up by Magnus to commit next.  Marking accordingly and moved
 to the current CommitFest.

I was going to, but I noticed a few things:

* I restructured the if statements, because I had a hard time
following the comments around that ;) I find this one easier - but I'm
happy to change back if you think your version was more readable.

* The error message in pg_signal_backend breaks the abstraction,
because it specifically talks about terminating the other backend -
when it's not supposed to know about that in that function. I think we
either need to get rid of the hint completely, or we need to find a
way to issue it from the caller. Or pass it as a parameter. It's fine
for now since we only have two signals, but we might have more in the
future..

* I gave it a run of pgindent ;)


In the attached updated patch I've just removed the HINT and changed
the reference from "terminate" to "signal". But I'd like your input
on that before I commit :-)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e7f7fe0..cf77586 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -14244,8 +14244,8 @@ SELECT set_config('log_statement_stats', 'off', false);
    <para>
     The functions shown in <xref
     linkend="functions-admin-signal-table"> send control signals to
-    other server processes.  Use of these functions is restricted
-    to superusers.
+    other server processes.  Use of these functions is usually restricted
+    to superusers, with noted exceptions.
    </para>
 
    <table id="functions-admin-signal-table">
@@ -14262,7 +14262,10 @@ SELECT set_config('log_statement_stats', 'off', false);
      <literal><function>pg_cancel_backend(<parameter>pid</parameter> <type>int</type>)</function></literal>
      </entry>
     <entry><type>boolean</type></entry>
-    <entry>Cancel a backend's current query</entry>
+    <entry>Cancel a backend's current query.  You can execute this against
+     another backend that has exactly the same role as the user calling the
+     function.  In all other cases, you must be a superuser.
+    </entry>
    </row>
    <row>
     <entry>
@@ -14304,6 +14307,10 @@ SELECT set_config('log_statement_stats', 'off', false);
     <command>postgres</command> processes on the server (using
     <application>ps</> on Unix or the <application>Task
     Manager</> on <productname>Windows</>).
+    For the less restrictive <function>pg_cancel_backend</>, the role of an
+    active backend can be found from
+    the <structfield>usename</structfield> column of the
+    <structname>pg_stat_activity</structname> view.
    </para>
 
    <para>
diff --git a/src/backend/utils/adt/misc.c b/src/backend/utils/adt/misc.c
index 7a2e0c8..d7f2435 100644
--- a/src/backend/utils/adt/misc.c
+++ b/src/backend/utils/adt/misc.c
@@ -30,6 +30,7 @@
 #include "postmaster/syslogger.h"
 #include "storage/fd.h"
 #include "storage/pmsignal.h"

Re: [HACKERS] JSON for PG 9.2

2011-12-18 Thread Magnus Hagander
On Sun, Dec 18, 2011 at 10:49, Jan Urbański wulc...@wulczer.org wrote:
 On 18/12/11 04:21, Robert Haas wrote:
 On Sat, Dec 17, 2011 at 7:50 PM, David E. Wheeler da...@kineticode.com 
 wrote:
 Love having the start here. I forwarded this message to Claes Jakobsson, 
 creator of the jansson-using pg-json extension. He’s a bit less supportive. 
 He gave me permission to quote him here:

 Frankly I see the inclusion of a JSON datatype in core as unnecessary. 
 Stuff should be moved out of core rather than in, as we do in Perl. Also, 
 does this patch mean that the 'json' type is forever claimed and can't be 
 replaced by extensions?

 There's little reason to reimplement JSON parsing, comparison and other 
 routines when there's a multitude of already good libraries.

 That's fair enough, but we've had *many* requests for this
 functionality in core; I don't see what we lose by having at least
 some basic functionality built in.

 I think having a JSON data type in core would drastically limit the
 exposure third-party JSON extensions would get, and that's bad. There are

The same way that having replication in core is bad for the rest of
the replication engines? While it has certainly decreased the usage of,
for example, Slony, I don't think anybody can say it's a bad thing that
we have this in core...

And of course, *not* having it in core, we didn't have people claiming
for many years that postgres has no replication or anything like
that... The fact is that a *lot* of our users, particularly in large
companies, will never install an extension that's not part of core.
Just look at other discussions about it even being a problem with it
being in *contrib*, which is still maintained and distributed by the
same developers.

We can hopefully get around this for the extensions in contrib (and
reasonably well have already), but few large companies are going to be
happy to go to pgxn and download an extension that has a single
maintainer (not "the team", and in most cases not even a team),
usually no defined lifecycle, no support, etc.  (I'm pretty sure you
won't get support included for random pgxn modules when you buy a
contract from EDB, or CMD, or us, or PGX, or anybody really - whereas
if the datatype is in core, you *will* get this.)

So I'm not sure it would really lessen the exposure much at all -
those that are willing to install such extensions already are surely
capable of finding it themselves (using pgxn for example - or even
google).


 tons of interesting features a JSON type could have, and tying their
 development to a one-year release cycle might be a disservice both to
 people who are willing to provide these features earlier and to users
 faced with a choice between a fast-moving third-party addon and a
 blessed core type; it would also cause overall confusion.

And the other option would be to *only* have a fast-moving third-party
addon, which simply disqualifies it completely in many environments.

Keeping it as a third party addon is better for the developer. Keeping
it in core is better for the user (if the user is a large company -
not a hacker).

If we can find a way to have a stable part in core and then have
addons that can provide these tons of interesting features (which I
agree there are) until such time that they can be considered stable
enough for core, I think that's the best compromise.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to allow users to kill their own queries

2011-12-18 Thread Robert Haas
On Sat, Dec 17, 2011 at 11:58 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I think this argument is bogus: if this is a real issue, then no use of
 kill() anytime, by anyone, is safe.  In practice I believe that Unix
 systems avoid recycling PIDs right away so as to offer some protection.

I'm not sure they do anything more sophisticated than cycling through
a sufficiently-large PID space, but whether it's that or something
else, I guess it must be adequate or they'd have enlarged the space...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers

2011-12-18 Thread Bruce Momjian
Dimitri Fontaine wrote:
 Bruce Momjian br...@momjian.us writes:
  We have a big user community and what _you_ want for this feature is
  only a small part of our decision on what is needed.  Robert's concern
  that this might not be useful enough for the general use-cases people
  want is a legitimate, if difficult to hear, analysis.
 
 Agreed, his concern is legitimate.  Now, I've never tried to
 implement a generic event trigger system and I don't know what it would
 take to implement such a beast.
 
 Transaction BEGIN, COMMIT, and ROLLBACK triggers, anyone?  Not me :)
 
 Exploring whether my proposal would be a pain to maintain once we have a
 fully generic event trigger system someday is legitimate; asking me to
 design both the generic event system and the command triggers so that
 they fit in is just asking for too much.

Agreed.  I am not against this addition, just pointing out that we have
to be careful.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Autonomous subtransactions

2011-12-18 Thread Jim Nasby
On Dec 18, 2011, at 2:28 AM, Gianni Ciolli wrote:
 I have written some notes about autonomous subtransactions, which have
 already been touched on (at least) in two separate threads; please find
 them at
 
  http://wiki.postgresql.org/wiki/Autonomous_subtransactions

The document seems to mix the terms "subtransaction" and "autonomous
transaction". That's going to generate a ton of confusion, because both
terms already have meaning associated with them:

- An autonomous transaction means you can execute something outside of your
current transaction, and it is in no way affected by the current transaction
(it doesn't matter whether T0 commits or not).
- Subtransactions are an alternative to savepoints. They allow you to break a
large transaction into smaller chunks, but if T0 doesn't commit then none of
the subtransactions commit either.
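
To make the distinction concrete, here is a minimal sketch of today's
savepoint-based subtransaction semantics (the audit_log table is a
hypothetical example), with the contrasting autonomous behavior noted
in comments:

  BEGIN;                              -- T0
  INSERT INTO audit_log VALUES ('step 1');
  SAVEPOINT s1;                       -- subtransaction begins
  INSERT INTO audit_log VALUES ('step 2');
  ROLLBACK TO SAVEPOINT s1;           -- undoes only 'step 2'
  ROLLBACK;                           -- undoes 'step 1' as well
  -- A (hypothetical) autonomous transaction would instead commit its
  -- work independently and survive the ROLLBACK of T0.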
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Cursor behavior

2011-12-18 Thread amit sehas
HI,

I had a question about the cursor internals implementation. When you fetch the
next 'n' results without moving the cursor, is this kind of functionality
implemented by first executing the whole query and then moving the cursor
over the results, or are the movements done as live database lookups, moving
forward and backward...

It seems that the cursor implementation functions evaluate the whole query and
then return results as requested; it would seem too difficult to
otherwise support functionality that returns the next 10 results without moving
the cursor... We need to determine this because in our case we are
dealing with a query that returns millions of records, and we would rather the
cursor did not evaluate the whole query upfront...
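
For reference, the access pattern in question looks like this (big_table
and id are hypothetical names):

  BEGIN;
  DECLARE c NO SCROLL CURSOR FOR SELECT * FROM big_table ORDER BY id;
  FETCH FORWARD 10 FROM c;   -- rows 1-10
  FETCH FORWARD 10 FROM c;   -- rows 11-20
  CLOSE c;
  COMMIT;

In general the executor produces cursor rows on demand rather than
materializing everything first, but that depends on the plan: an index
scan that satisfies the ORDER BY can stream rows, while a plan with an
explicit sort has to consume its entire input before the first row
comes back.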

any help is greatly appreciated...

thanks

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] array behavior

2011-12-18 Thread amit sehas
If a field (attribute) in a type is declared to be an array, is it always
the case that the array will be packed into a single tuple that resides in the
database? There is the TOAST mechanism for oversized tuples, but that is still
considered to be a single tuple. Is there any circumstance in which an
attribute which is an array will be broken up into individual
tuples which are somehow associated with the main tuple? For example, if the
array happens to have 5000 elements, the tuple will become quite large if these
are packed within a single tuple, and additionally it may have an undesirable
performance impact if the queries are not even interested in seeing the array
when fetching the object?
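
A short sketch of how one might check what actually happens (the table
t here is hypothetical; pg_column_size() and the per-column STORAGE
settings are standard):

  CREATE TABLE t (id int, big int[]);
  INSERT INTO t SELECT 1, array_agg(g) FROM generate_series(1, 5000) g;
  SELECT pg_column_size(big) AS datum_bytes FROM t;      -- one datum
  ALTER TABLE t ALTER COLUMN big SET STORAGE EXTERNAL;   -- out-of-line, uncompressed

The array always stays a single datum inside a single tuple; TOAST may
compress it or move it out of line into the toast table, and queries
that never reference the column do not fetch the out-of-line data.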

thanks

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Measuring relation free space

2011-12-18 Thread Noah Misch
On Fri, Dec 16, 2011 at 02:02:03AM -0500, Greg Smith wrote:
 On 12/15/2011 04:11 PM, Noah Misch wrote:
 Is one of those index figures simply wrong, or do they measure two senses of
 free space, both of which are interesting to DBAs?

 I think the bigger one--the one I was aiming to measure--also includes  
 fill-factor space.  It should be possible to isolate whether that's true  
 by running the function against a fresh index, or by trying tests with a  
 table where there's no useful fill.  I need to add some of those to the  
 test example suite.

No, both measures include fillfactor space.  From a brief look at the code, the
proposed function counts space in non-leaf pages, while pgstattuple does not.
Also, the proposed function counts half-dead pages like live pages, while
pgstattuple counts them like dead pages.

One could perhaps justify those choices either way, but they seem too esoteric
for DBA exposure.  I recommend choosing a policy on each and making both
pgstattuple() and any new code respect that policy.

 Shaking out the alternate implementation ideas was really my goal for  
 this CF here.  The major goal of the next revision is to present the  
 options with a measure of their respective accuracy and runtime.  If I  
 have to give up just a bit of accuracy and make it much faster,  
 that's probably what most people want as an option.  When Jaime and I  
 come back with an update, it really needs to have benchmarks and  
 accuracy numbers for each option.  That may be complicated a bit  
 depending on how much of the table or index is cached, so isolating that  
 out will be a pain.

The previous submission seemed to boil down to a speedier version of SELECT
free_percent FROM pgstattuple('foo').  (Some of the other statistics aren't
cheap.)  Considering that, the code does belong in the pgstattuple module.

The sampling approach you have mentioned sounds promising, especially for
indexes.  For heap bloat, it may be hard to improve on pg_freespacemap-based and
check_postgres-style estimates with anything less than a full heap scan.
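
As a sketch of that last kind of estimate (assuming the pg_freespacemap
contrib module, a table named foo, and the default 8192-byte block
size):

  CREATE EXTENSION pg_freespacemap;
  SELECT count(*) AS pages,
         sum(avail) AS free_bytes,
         round(100.0 * sum(avail) / (count(*) * 8192), 2) AS free_percent
  FROM pg_freespace('foo');

This reads only the free space map rather than the heap, so it is cheap,
but it is only as current as the last VACUUM that updated the FSM.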

Thanks,
nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-12-18 Thread Josh Berkus
Greg,

 You keep asking the hard questions.

I practice. ;-)

 Right now, I kind of like that it's
 possible to copy a postgresql.conf file from master to standby and just
 use it.  That should still be possible with the realignment into GUCs:

... long discussion omitted here.

I agree that GUC vs. standby.enabled is a trade-off.  I further agree
that where we're going with this eventually is SET PERSISTENT.  I feel
that Greg's proposal is a substantial improvement on the current
arrangement and eliminates *my* #1 source of replication-configuration
pain, and we can keep improving it later.

I think we need to give some thought as to how this will play out for
PITR, since there is far less reason to change the operation of PITR,
and much older backup tools which rely on its current operation.

Otherwise, +1.

 shove all into one release.  There's a simple path from there that leads
 to both easier tools all around and SET PERSISTENT, and it comes with a
 pile of disruption so big I could throw in "standby controls are now
 100% GUC" for you plus a unicorn and it would slip right by unnoticed. 
 That's a tough roadmap to sell unless those promised benefits are proven
 first though.  And I'm thinking a release doing all that is going to
 want to be named 10.0--and what I could really use is a nice, solid 9.2
 that doesn't scare enterprises with too much change next.

I would love to see a writeup on this someday.  Blog?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON for PG 9.2

2011-12-18 Thread Merlin Moncure
On Sat, Dec 17, 2011 at 4:02 PM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 I'd like to add some confusion on the implementation choice, because it
 looks damn too easy now… Guile 2.0 offers an implementation of the 
 ECMAscript language and plscheme already exists as a PostgreSQL PL extension 
 for integrating with Guile.

TBH, I think that's PFC (pretty cool).
On Sun, Dec 18, 2011 at 6:41 AM, Magnus Hagander mag...@hagander.net wrote:
 We can hopefully get around this for the extensions in contrib (and
 reasonably well has already), but few large companies are going to be
 happy to go to pgxn and download an extension that has a single
 maintainer (not the team, and in most cases not even a team),
 usually no defined lifecycle, no support, etc. (I'm pretty sure you
 won't get support included for random pgxn modules when you buy a
 contract from EDB, or CMD, or us, or PGX, or anybody really - wheras
 if it the datatype is in core, you *will* get this)

100% agree on all points.  with the new extension system, contrib
modules that are packaged with the core system can be considered to be
in core because they are:
*) documented in standard docs
*) supported and bugfixed with postgresql releases
*) ready to be used without compiler support or even shell access
through most binary distributions

One small note about the json type being an extension -- this probably
means the json type oid won't be fixed -- not a huge deal but it could
affect some corner cases with binary format consumers.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON for PG 9.2

2011-12-18 Thread Andrew Dunstan



On 12/18/2011 12:17 PM, Merlin Moncure wrote:

One small note about the json type being an extension -- this probably
means the json type oid won't be fixed -- not a huge deal but it could
affect some corner cases with binary format consumers.





Why would that matter more for JSON than for any other non-core type?

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON for PG 9.2

2011-12-18 Thread Merlin Moncure
On Sun, Dec 18, 2011 at 11:21 AM, Andrew Dunstan and...@dunslane.net wrote:


 On 12/18/2011 12:17 PM, Merlin Moncure wrote:

 One small note about the json type being an extension -- this probably
 means the json type oid won't be fixed -- not a huge deal but it could
 affect some corner cases with binary format consumers.

 Why would that matter more for JSON than for any other non-core type?

well, it's a minor headache for all the oid-isn't-in-pg_type.h types,
and only then for high traffic types (which presumably json will be).
 a while back we coded up a reworked dblink that was variadic and
could optionally transfer data between databases with the binary wire
format.   any container of a user defined (by oid) type had to be sent
strictly as text which is a big performance hit for certain types.
recent postgres has an undocumented facility to force type oids to a
particular value, but the type definition being inside the create
extension script makes this problematic.

this is a pretty far out objection though, and I could certainly work
around the problem if necessary, but there is some dependency on
pg_type.h in the wild.
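
One client-side workaround is to resolve the OID at connection time
instead of baking in a compile-time constant, e.g.:

  SELECT 'json'::regtype::oid;                     -- errors if the type is absent
  SELECT oid FROM pg_type WHERE typname = 'json';  -- returns zero rows instead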

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON for PG 9.2

2011-12-18 Thread Dimitri Fontaine
Merlin Moncure mmonc...@gmail.com writes:
 Why would that matter more for JSON than for any other non-core type?

 well, it's a minor headache for all the oid-isn't-in-pg_type.h types,
 and only then for high traffic types (which presumably json will be).

Extensions are going to be more and more used and “pervasive” in the
coming years, and binary wire transfer is a good goal.  What about
creating something like an IANA for PostgreSQL types?

New type authors would register their OID and as a benefit would get
listed on some public reference sheet, and we could add some mechanism
so that default CREATE TYPE calls will not use reserved OID numbers.

It would be cooperative only, so not a security thing, just a
way to ease coexistence of binary transfer and extensions.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Page Checksums

2011-12-18 Thread David Fetter
On Sun, Dec 18, 2011 at 12:19:32PM +0200, Heikki Linnakangas wrote:
 On 18.12.2011 10:54, David Fetter wrote:
 On Sun, Dec 18, 2011 at 10:14:38AM +0200, Heikki Linnakangas wrote:
 On 17.12.2011 23:33, David Fetter wrote:
 If this introduces new failure modes, please detail, and preferably
 demonstrate, just what those new modes are.
 
 Hint bits, torn pages -> failed CRC. See earlier discussion:
 
 http://archives.postgresql.org/pgsql-hackers/2009-11/msg01975.php
 
 The patch requires that full page writes be on in order to obviate
 this problem by never reading a torn page.
 
 Doesn't help. Hint bit updates are not WAL-logged.

What new failure modes are you envisioning for this case?  Any way to
simulate them, even if it's by injecting faults into the source code?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers

2011-12-18 Thread Tom Lane
Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 The main part of my answer, though, is that all the more complex use
 cases involving command triggers that Robert is offering are in fact
 possible to implement with what my patch is providing, as long as you're
 ok with understanding the content and format of the nodeToString()
 output.

Hmm ... I don't think that I *am* ok with that.  ISTM that we'd then
find ourselves with any changes in utility statement parse trees
amounting to a user-visible API break, and that's not an acceptable
situation.

We already have this issue of course with respect to C-code add-ons,
but (1) we've established an understanding that people should have to
recompile those for every major release, and (2) changes such as adding
a new field, or even changing an existing field that you don't care
about, don't break C source code.  I don't know exactly what you're
imagining that user-written triggers would do with nodeToString strings,
but I'd bet a good lunch that people will use ad-hoc interpretation
methods that are not robust against changes at all.  And then they'll
blame us when their triggers break --- not unreasonably, because we
failed to provide a sane API for them to use.

We really need some higher-level API than the raw parse tree, and
I have to admit that I have no idea what that would look like.
But exposing parse trees to user-written triggers is a decision
that we will come to regret, probably as soon as the next release.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Page Checksums

2011-12-18 Thread Heikki Linnakangas

On 18.12.2011 20:44, David Fetter wrote:

On Sun, Dec 18, 2011 at 12:19:32PM +0200, Heikki Linnakangas wrote:

On 18.12.2011 10:54, David Fetter wrote:

On Sun, Dec 18, 2011 at 10:14:38AM +0200, Heikki Linnakangas wrote:

On 17.12.2011 23:33, David Fetter wrote:

If this introduces new failure modes, please detail, and preferably
demonstrate, just what those new modes are.


Hint bits, torn pages -> failed CRC. See earlier discussion:

http://archives.postgresql.org/pgsql-hackers/2009-11/msg01975.php


The patch requires that full page writes be on in order to obviate
this problem by never reading a torn page.


Doesn't help. Hint bit updates are not WAL-logged.


What new failure modes are you envisioning for this case?


Umm, the one explained in the email I linked to... Let me try once more. 
For the sake of keeping the example short, imagine that the PostgreSQL 
block size is 8 bytes, and the OS block size is 4 bytes. The CRC is 1 
byte, and is stored in the first byte of each page.


In the beginning, a page is in the buffer cache, and it looks like this:

AA 12 34 56  78 9A BC DE

AA is the checksum. Now a hint bit on the last byte is set, so that the 
page in the shared buffer cache looks like this:


AA 12 34 56  78 9A BC DF

Now PostgreSQL wants to evict the page from the buffer cache, so it 
recalculates the CRC. The page in the buffer cache now looks like this:


BB 12 34 56  78 9A BC DF

Now, PostgreSQL writes the page to the OS cache, with the write() system 
call. It sits in the OS cache for a few seconds, and then the OS decides 
to flush the first 4 bytes, ie. the first OS block, to disk. On disk, 
you now have this:


BB 12 34 56  78 9A BC DE

If the server now crashes before the OS has flushed the second half of 
the PostgreSQL page to disk, you have a classic torn page. The updated 
CRC made it to disk, but the hint bit did not. The CRC on disk is not 
valid for the rest of the contents of that page on disk.


Without CRCs, that's not a problem because the data is valid whether or 
not the hint bit makes it to the disk. It's just a hint, after all. But 
when you have a CRC on the page, the CRC is only valid if both the CRC 
update *and* the hint bit update makes it to disk, or neither.


So you've just turned an innocent torn page, which PostgreSQL tolerates 
just fine, into a block with bad CRC.


 Any way to
 simulate them, even if it's by injecting faults into the source code?

Hmm, it's hard to persuade the OS to suffer a torn page on purpose. What 
you could do is split the write() call in mdwrite() into two. First 
write the 1st half of the page, then the second. Then you can put a 
breakpoint in between the writes, and kill the system before the 2nd 
half is written.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Page Checksums

2011-12-18 Thread Peter Eisentraut
On sön, 2011-12-18 at 21:34 +0200, Heikki Linnakangas wrote:
 On 18.12.2011 20:44, David Fetter wrote:
   Any way to
   simulate them, even if it's by injecting faults into the source code?
 
 Hmm, it's hard to persuade the OS to suffer a torn page on purpose. What 
 you could do is split the write() call in mdwrite() into two. First 
 write the 1st half of the page, then the second. Then you can put a 
 breakpoint in between the writes, and kill the system before the 2nd 
 half is written.

Perhaps the Library-level Fault Injector (http://lfi.sf.net) could be
used to set up a test for this.  (Not that I think you need one, but if
David wants to see it happen himself ...)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Page Checksums

2011-12-18 Thread Jesper Krogh

On 2011-12-18 11:19, Heikki Linnakangas wrote:

The patch requires that full page writes be on in order to obviate
this problem by never reading a torn page.


Doesn't help. Hint bit updates are not WAL-logged.


I don't know if it would be seen as a half-baked feature.. or similar,
and I don't know if the hint bit problem is solvable at all, but I could
easily imagine checksumming just skipping the hint bits entirely.

It would still provide checksumming for the majority of the data sitting
underneath the system, and would still be extremely useful in my
eyes.

Jesper
--
Jesper

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] splitting plpython into smaller parts

2011-12-18 Thread Peter Eisentraut
On tis, 2011-12-06 at 00:58 +0100, Jan Urbański wrote:
 Rebased against master after the SPI cursor patch has been committed.
 
 The first patch removes SPI boilerplate from the cursor functions as
 well and the second patch creates a plpython_cursor.c file.
 
 A side effect of creating a separate file for cursors is that I had to
 make PLy_spi_transaction_{begin,commit,abort} helper functions external
 since they're used both by regular SPI execution functions and the
 cursor functions.
 
 They live in plpython_spi.c, which is not an ideal place for them, but
 IMHO it's not bad either.

Committed now.

I moved a few more things around.  I split up the plpython.h header file
to create a separate header file for each .c file, so that the
hierarchical relationship of the modules is clearer.  (The only cases
of circular relationships should be caused by use of global variables.)
Also, I named the files that contain classes or modules more like they
are in the CPython source code, e.g., plpy_cursorobject.c.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers

2011-12-18 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 Hmm ... I don't think that I *am* ok with that.  ISTM that we'd then
 find ourselves with any changes in utility statement parse trees
 amounting to a user-visible API break, and that's not an acceptable
 situation.

Oh, you mean like exposing the parser for syntax coloring etc.  I failed
to see that it's the same case.  Do we have an acceptable proposal on that
front yet?

 We already have this issue of course with respect to C-code add-ons,
 but (1) we've established an understanding that people should have to
 recompile those for every major release, and (2) changes such as adding
 a new field, or even changing an existing field that you don't care
 about, don't break C source code.  I don't know exactly what you're
 imagining that user-written triggers would do with nodeToString strings,
 but I'd bet a good lunch that people will use ad-hoc interpretation
 methods that are not robust against changes at all.  And then they'll
 blame us when their triggers break --- not unreasonably, because we
 failed to provide a sane API for them to use.

Could we offer people a sane API?

Another way could be to bypass BEFORE triggers and let people look at
the catalogs in the AFTER command trigger, and give them the object oid,
name and schemaname for them to do their lookups.

You can still RAISE EXCEPTION in an AFTER command trigger to cancel the
command execution; what you can no longer do is cancel the command
without killing the current transaction.

 We really need some higher-level API than the raw parse tree, and
 I have to admit that I have no idea what that would look like.
 But exposing parse trees to user-written triggers is a decision
 that we will come to regret, probably as soon as the next release.

I was under the illusion that providing users with ready-to-tweak
examples of robust-against-changes code would cut it.  I'd like the
command triggers patch not to depend on designing this API we need.

What do you think of removing the parsetree and the BEFORE trigger
support (so that trigger function can query the catalogs)?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] splitting plpython into smaller parts

2011-12-18 Thread Jan Urbański
On 18/12/11 20:53, Peter Eisentraut wrote:
 On tis, 2011-12-06 at 00:58 +0100, Jan Urbański wrote:
 Rebased against master after the SPI cursor patch has been committed.

 The first patch removes SPI boilerplate from the cursor functions as
 well and the second patch creates a plpython_cursor.c file.

 A side effect of creating a separate file for cursors is that I had to
 make PLy_spi_transaction_{begin,commit,abort} helper functions external
 since they're used both by regular SPI execution functions and the
 cursor functions.

 They live in plpython_spi.c, which is not an ideal place for them, but
 IMHO it's not bad either.
 
 Committed now.

Great, thanks! I hope this will make for a more maintainable PL/Python.

By the way, the buildfarm is turning red because it's missing the
attached patch.

Cheers,
Jan
diff --git a/src/pl/plpython/Makefile b/src/pl/plpython/Makefile
index 12ce26e..a31328d 100644
*** a/src/pl/plpython/Makefile
--- b/src/pl/plpython/Makefile
*** endif # can't build
*** 189,195 
  # distprep and maintainer-clean rules should be run even if we can't build.
  
  # Force this dependency to be known even without dependency info built:
! plpython_plpy.o: spiexceptions.h
  
  spiexceptions.h: $(top_srcdir)/src/backend/utils/errcodes.txt generate-spiexceptions.pl
	$(PERL) $(srcdir)/generate-spiexceptions.pl $< > $@
--- 189,195 
  # distprep and maintainer-clean rules should be run even if we can't build.
  
  # Force this dependency to be known even without dependency info built:
! plpy_plpymodule.o: spiexceptions.h
  
  spiexceptions.h: $(top_srcdir)/src/backend/utils/errcodes.txt generate-spiexceptions.pl
	$(PERL) $(srcdir)/generate-spiexceptions.pl $< > $@

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers

2011-12-18 Thread Tom Lane
Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 Tom Lane t...@sss.pgh.pa.us writes:
 We really need some higher-level API than the raw parse tree, and
 I have to admit that I have no idea what that would look like.
 But exposing parse trees to user-written triggers is a decision
 that we will come to regret, probably as soon as the next release.

 I was under the illusion that providing users with ready to tweak
 examples of robust-against-changes code would cut it.  I'd like the
 command triggers patch not to depend on designing this API we need.

Well, we don't have any such examples, because frankly the nodeToString
representation is pretty damn unfriendly.  The only code we have that
does anything with it at all is the readfuncs.c code that turns it back
into trees of C structs, and that's no help for triggers not themselves
written in C.  Besides which, readfuncs.c is the poster child for code
that does have to change every time we tweak the struct definitions.
We can't tell people to copy that approach.

 What do you think of removing the parsetree and the BEFORE trigger
 support (so that trigger function can query the catalogs)?

Well, it gets us out of the business of inventing a suitable API,
but I think it also reduces the feature to a point of near uselessness.
Essentially we'd be saying to trigger authors "something changed, feel
free to inspect the catalogs and see if you can guess what".

Just because the problem is hard doesn't mean you can get away with
not solving it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers

2011-12-18 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 Well, we don't have any such examples, because frankly the nodeToString
 representation is pretty damn unfriendly.  The only code we have that

I tend to agree here, but I know that Jan is convincing enough when he's
saying that it is in fact very friendly.

 does anything with it at all is the readfuncs.c code that turns it back
 into trees of C structs, and that's no help for triggers not themselves
 written in C.  Besides which, readfuncs.c is the poster child for code
 that does have to change every time we tweak the struct definitions.
 We can't tell people to copy that approach.

Providing the same nested C structs thingy in python or perl or tcl
might be feasible and not too sketchy to maintain, but I'm failing to
see how to even approach that for plpgsql.

 What do you think of removing the parsetree and the BEFORE trigger
 support (so that trigger function can query the catalogs)?

 Well, it gets us out of the business of inventing a suitable API,
 but I think it also reduces the feature to a point of near uselessness.

Not being generic and flexible is not the same thing as not being of any
use at all.  Extension whitelisting is still possible to implement
because all you need to know is the extension's name, then you choose to
let the command string you're given execute or not.  Same with
replication or simple auditing cases, you still have the plain command
string to play with.

Not useful enough to be what we ship in 9.2, I can follow you there;
not useful at all, disagreed.

 Essentially we'd be saying to trigger authors something changed, feel
 free to inspect the catalogs and see if you can guess what.

No, we'd also be providing the main OID of the object that changed (a
pg_class entry for a CREATE TABLE command, etc), the object name and its
schema name too.  And the command string too.  ALTER TABLE is still
difficult to handle, other more simple commands might be ok.

 Just because the problem is hard doesn't mean you can get away with
 not solving it.

That is the single simplest way of handling it, though, so I had to try
that first.  Now, maybe we can find the right approach to publishing the
parse tree this month still.  Any ideas welcome!

I guess XML would be ok but we don't embed powerful enough tools, and
JSON might be perfect but we would need to have a full blown datatype
and functions to work with that from plpgsql.  What other tree-ish data
type can we have?

EXPLAIN is already able to spit out XML and JSON (and YAML) but the
typical client consuming that output is not running as a backend stored
procedure, so I guess that's not a precedent and we still need something
with a good support (type, operators, walking functions…) to back it.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Page Checksums

2011-12-18 Thread Greg Stark
On Sun, Dec 18, 2011 at 7:51 PM, Jesper Krogh jes...@krogh.cc wrote:
 I don't know if it would be seen as a half-baked feature.. or similar,
 and I don't know if the hint bit problem is solvable at all, but I could
 easily imagine checksumming just skipping the hint bits entirely.

That was one approach discussed. The problem is that the hint bits are
currently in each heap tuple header which means the checksum code
would have to know a fair bit about the structure of the page format.
Also the closer people looked the more hint bits kept turning up
because the coding pattern had been copied to other places (the page
header has one, and index pointers have a hint bit indicating that the
target tuple is deleted, etc). And to make matters worse skipping
individual bits in varying places quickly becomes a big consumer of
cpu time since it means injecting logic into each iteration of the
checksum loop to mask out the bits.

So the general feeling was that we should move all the hint bits to a
dedicated part of the buffer so that they could all be skipped in a
simple way that doesn't depend on understanding the whole structure of
the page. That's not conceptually hard, it's just a fair amount of
work. I think that's where it was left off.

There is another way to look at this problem. Perhaps it's worth
having a checksum *even if* there are ways for the checksum to be
spuriously wrong. Obviously having an invalid checksum can't be a
fatal error then, but it might still be useful information. Right now
people don't really know if their system can experience torn pages or
not and having some way of detecting them could be useful. And if you
have other unexplained symptoms then having checksum errors might be
enough evidence that the investigation should start with the hardware
and get the sysadmin looking at hardware logs and running memtest
sooner.




-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers

2011-12-18 Thread Robert Haas
On Sun, Dec 18, 2011 at 5:11 PM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 I guess XML would be ok but we don't embed powerful enough tools, and
 JSON might be perfect but we would need to have a full blown datatype
 and functions to work with that from plpgsql.  What other tree-ish data
 type can we have?

 EXPLAIN is already able to spit out XML and JSON (and YAML) but the
 typical client consuming that output is not running as a backend stored
 procedure, so I guess that's not a precedent and we still need something
 with a good support (type, operators, walking functions…) to back it.

Right.  If we're actually going to expose the parse tree, I think JSON
(or even XML) would be a far better way to expose that than the
existing nodeToString() output.  Sure, you could make do with the
nodeToString() output for some things, especially in PL/perl or
PL/python.  But JSON would be far better, since it's a standard format
rather than something we just made up, and could be used in PL/pgsql
as well, given proper support functions.

Another option would be to do something like this:

CREATE TYPE pg_trigger_on_create_table AS (
catalog_name text,
schema_name text,
relation_name text,
...
);

That's not materially different from exposing the parse tree, but it's
more convenient for PL/pgsql and doesn't require adding a new datatype
like JSON.  It might require an awful lot of tuple-construction code
and datatype definitions, though.

Still another option would be to expose some of this information
through magical variables or functions, sort of like the way that
declaring a function to return trigger causes it to have NEW and OLD.
It could have STUFF.catalog_name, STUFF.schema_name,
STUFF.relation_name, or whatever we want.
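
A purely hypothetical sketch of that last option, to make it concrete
(neither this CREATE COMMAND TRIGGER syntax nor the STUFF variables
exist; the names only illustrate the shape of the API):

  CREATE FUNCTION log_create_table() RETURNS command_trigger AS $$
  BEGIN
      -- STUFF.* would be filled in by the trigger machinery, like NEW/OLD
      RAISE NOTICE 'created %.%', STUFF.schema_name, STUFF.relation_name;
  END;
  $$ LANGUAGE plpgsql;

  CREATE COMMAND TRIGGER log_ct AFTER CREATE TABLE
      EXECUTE PROCEDURE log_create_table();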

None of these approaches really get around the fact that if the
command syntax changes, the trigger API has to change, too.  You might
be able to get around that for CREATE commands by having only AFTER
triggers, and just passing the OID; and for DROP commands by having
only BEFORE triggers, and just passing the OID.  But I don't see any
way to make it work very well for ALTER commands.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Page Checksums

2011-12-18 Thread Josh Berkus
On 12/18/11 5:55 PM, Greg Stark wrote:
 There is another way to look at this problem. Perhaps it's worth
 having a checksum *even if* there are ways for the checksum to be
 spuriously wrong. Obviously having an invalid checksum can't be a
 fatal error then, but it might still be useful information. Right now
 people don't really know if their system can experience torn pages or
 not and having some way of detecting them could be useful. And if you
 have other unexplained symptoms then having checksum errors might be
 enough evidence that the investigation should start with the hardware
 and get the sysadmin looking at hardware logs and running memtest
 sooner.

Frankly, if I had torn pages, even if it was just hint bits missing, I
would want that to be logged.  That's expected if you crash, but if you
start seeing bad CRC warnings when you haven't had a crash?  That means
you have a HW problem.

As long as the CRC checks are by default warnings, then I don't see a
problem with this; it's certainly better than what we have now.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON for PG 9.2

2011-12-18 Thread Merlin Moncure
On Sun, Dec 18, 2011 at 12:26 PM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 Why would that matter more for JSON than for any other non-core type?

 well, it's a minor headache for all the oid-isn't-in-pgtypes.h types,
 and only then for high traffic types (which presumably json will be).

 Extensions are going to be more and more used and “pervasive” in the
 coming years, and binary wire transfer is a good goal.  What about
 creating something like an IANA for PostgreSQL types?

 New type authors would register their OID and as a benefit would get
 listed on some public reference sheet, and we could add some mechanism
 so that default CREATE TYPE calls will not use reserved OID numbers.

 It would be cooperative only, so not a security thing, just a
 way to ease coexistence of binary transfer and extensions.

I think that's a fabulous idea, although we're drifting off the stated
topic here.


Getting back on point, I'm curious about your statement: "without
writing a single line of C".  I took a look at the pl/scheme docs and
was pretty impressed -- what exactly would be involved to get a
guile-based ECMAscript working over the pl/scheme implementation?  How
would that interact exactly with the stated topic -- JSON support?  Do
you even need a json type if you have strong library based parsing and
composition features?

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Page Checksums

2011-12-18 Thread Aidan Van Dyk
On Sun, Dec 18, 2011 at 11:21 PM, Josh Berkus j...@agliodbs.com wrote:
 On 12/18/11 5:55 PM, Greg Stark wrote:
 There is another way to look at this problem. Perhaps it's worth
 having a checksum *even if* there are ways for the checksum to be
 spuriously wrong. Obviously having an invalid checksum can't be a
 fatal error then, but it might still be useful information. Right now
 people don't really know if their system can experience torn pages or
 not and having some way of detecting them could be useful. And if you
 have other unexplained symptoms then having checksum errors might be
 enough evidence that the investigation should start with the hardware
 and get the sysadmin looking at hardware logs and running memtest
 sooner.

 Frankly, if I had torn pages, even if it was just hint bits missing, I
 would want that to be logged.  That's expected if you crash, but if you
 start seeing bad CRC warnings when you haven't had a crash?  That means
 you have a HW problem.

 As long as the CRC checks are by default warnings, then I don't see a
 problem with this; it's certainly better than what we have now.

But the scary part is you don't know how long *ago* the crash was.
Because a hint-bit-only change w/ a torn page is a non-event in the
PostgreSQL *DESIGN*, crash recovery doesn't do anything to try
to scrub every page in the database.

So you could have a crash, then a recovery, and a couple clean
shutdown-restart combinations before you happen to read the needed
page that was torn in the crash $X [ days | weeks | months ] ago.
It's specifically because PostgreSQL was *DESIGNED* to make torn pages
a non-event (because WAL/FPW fixes anything that's dangerous), that
the whole CRC issue is so complicated...

I'll throw out a few random thoughts (some repeated) that people who
really want the CRC can fight over:

1) Find a way to not bother writing out hint-bit-only-dirty pages.
 I know people like Kevin keep recommending a vacuum freeze after a
big load to avoid later problems anyway, and I think that's probably
common in big OLAP shops, and OLTP people are likely to have real
changes on the page anyway.  Does anybody want to try and measure
what type of performance trade-offs we'd really have on a variety of
"normal" (ya, I know, what's "normal") workloads?  If the page has a
real change, it's got a WAL FPW, so we avoid the problem.

2) If the writer/checksummer knows it's a hint-bit-only-dirty page,
can it stuff a "cookie" checksum in it and not bother verifying?
Loses a bit of the CRC guarantee, especially around crashes, which
is when we expect a torn page, but avoids the whole "scary! scary!
Your database is corrupt!" false positives in the situation PostgreSQL
was specifically designed to make not scary.

3) Anybody investigated putting the CRC in a relation fork, but not
right in the data block?  If the CRC contains a timestamp, and is WAL
logged before the write, then at least on reading a block with a wrong
checksum, if a warning is emitted, the timestamp could be looked at by
whoever is reading the warning, who would know that the block was written
shortly before the crash $X $PERIODS ago.

The whole "CRC is only a warning because we expect to get them if we
ever crashed" means that at the time when we most want them, we have to
assume they are bogus...  And to make matters worse, we don't even
know when the period of "they may be bogus" ends, unless we have a
way to methodically force PG through every buffer in the database after
the crash...   And that makes them very hard to consider
useful...


a.

-- 
Aidan Van Dyk                                             Create like a god,
ai...@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers