Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-02 Thread Robert Haas
On Wed, May 28, 2014 at 8:22 AM, ash a...@commandprompt.com wrote:
 None of that involves answering hypothetical questions; but what you
 want to do does, and that I think is the problem in a nutshell.

 In a nutshell I'd like PostgreSQL to just re-parse the *current* view
 definition.  Should that throw an error, user intervention will be
 required anyway, but most of the time it should just work.

What exactly do you mean by re-parse the current view definition?
The only form of the view definition we actually have is already
parsed into an internal form (see pg_rewrite) which, for the reasons
I've attempted to explain, is not easy to adapt to new column types.
I suppose we could deparse that definition and then reparse the
results, but that could lead to some very surprising consequences
(some of which are security-relevant).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] replication protocol documentation inconsistencies

2014-06-02 Thread Robert Haas
On Wed, May 28, 2014 at 6:51 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-05-21 07:29:53 -0400, Peter Eisentraut wrote:
 Looking at
 http://www.postgresql.org/docs/devel/static/protocol-replication.html
 under START_REPLICATION it goes

 
 The payload of each CopyData message from server to the client contains
 a message of one of the following formats:

 If a slot's name is provided via slotname, it will be updated as
 replication progresses so that the server knows which WAL segments - and
 if hot_standby_feedback is on which transactions - are still needed by
 the standby.

 XLogData (B)
 ...

 Primary keepalive message (B)
 ...
 

 That second paragraph was inserted recently and doesn't make sense
 there.  It should be moved somewhere else.

 Hm. I am not sure why it doesn't make sense there? It's about the SLOT
 $slotname parameter to START_REPLICATION?

I think it would probably read better if we added that into the first
paragraph about START_REPLICATION, instead of having it down at the
end.  i.e. Instructs server to start streaming WAL, starting at WAL
position XXX/XXX. If TIMELINE option is specified, streaming starts on
timeline tli; otherwise, the server's current timeline is selected.
The server can reply with an error, e.g. if the requested section of
WAL has already been recycled.  On success, server responds with a
CopyBothResponse message, and then starts to stream WAL to the
frontend.  If a slot's name is provided via slotname, it will be
updated as replication progresses so that the server knows which WAL
segments - and if hot_standby_feedback is on which transactions - are
still needed by the standby.

This bit here:

The payload of each CopyData message from server to the client
contains a message of one of the following formats:

...is followed by a colon and needs to immediately precede the list to
which it refers.

 More generally, it is weird that the message formats are described
 there, even though the rest of the protocol documentation only mentions
 the messages by name and then describes the formats later
 (http://www.postgresql.org/docs/devel/static/protocol-message-types.html
 and
 http://www.postgresql.org/docs/devel/static/protocol-message-formats.html).
  For example, the meaning of the (B) code isn't described until two
 sections later.

 I am not sure that makes sense. These messages cannot be sent as
 toplevel messages - they're just describing the contents of the CopyBoth
 stream after START_REPLICATION has begun. It seems weird to add these
 'subprotocol' messages to the toplevel protocol description.

I see your point, but I think Peter has a good point, too.  It would
be weird to document this mini-protocol mixed in with the main
protocol, and it's so short that adding a separate section for it
hardly makes sense, but it's still strange to have the mini-protocol
being documented before we've explained our conventions for how we
document wire protocol messages.  Forward references are best avoided,
especially implicit ones.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Documenting the Frontend/Backend Protocol update criteria

2014-06-02 Thread Koichi Suzuki
Jan Urbański gave a presentation titled 'Postgres on the wire',
subtitled 'A look at the PostgreSQL wire protocol'.   I hope this
covers some of your interest.   The slide deck is available at
http://www.pgcon.org/2014/schedule/attachments/330_postgres-for-the-wire.pdf

Hope it helps;
---
Koichi Suzuki


2014-06-02 7:22 GMT+09:00 Mikko Tiihonen mikko.tiiho...@nitorcreations.com:
 Hi,

 Currently the criteria for updating the F/B protocol are undefined. This makes
 it hard to update the protocol going forward. It also makes it hard to write
 library/driver/application implementations that will be more future-proof
 against future server versions.

 Ideally the documentation for 9.4 (backport?) would say what kind of things
 are allowed to change within the v3 protocol, and thus implies what kind of
 changes need a new v4 protocol. Is there some wishlist page of items to do
 differently for v4 already?

 I did find the following text in the documentation: "binary representations
 for complex data types might change across server versions". But having more
 specific rules would help, especially since it seems to be there just to
 scare: so far changes have been strongly discouraged.

 An example to consider: some binary formats have a flags field (arrays) or a
 version field (jsonb). We should explicitly say that clients must reject any
 bits/versions that they do not know about. This guarantees they detect small
 format updates instead of silently accepting them and possibly returning
 corrupt data.

 My motivation:

 Two years ago I accidentally opened a discussion on how to do updates to the
 binary encoding of data in the protocol [1]. I would like to reopen the
 discussion now, since the jsonb 'binary' encoding is just a version '1' +
 text json. The result of the last discussion was that having a version or
 flags field as part of the binary format is not enough, since drivers/libraries
 (fixable) and applications (unfixable) depend on the current encoding.
 And if we add a new bit to the flags or bump the version number, we will
 break backward compatibility.

 To summarize the previous discussion:
 - there are currently no written rules for modifying the binary encoding
 formats
 - the bytea modification was done with a GUC, but a GUC was seen as a bad
 solution in general
 - my proposal to add a minor format version number was not good either,
 since any per-session state would be problematic for connection poolers

 [1]:
 http://grokbase.com/t/postgresql/pgsql-hackers/11bwhv1esa/add-minor-version-to-v3-protocol-to-allow-changes-without-breaking-backwards-compatibility




Re: [HACKERS] Spreading full-page writes

2014-06-02 Thread Fujii Masao
On Wed, May 28, 2014 at 1:10 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Tue, May 27, 2014 at 1:19 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, May 27, 2014 at 3:57 PM, Simon Riggs si...@2ndquadrant.com
 wrote:
  The requirements we were discussing were around
 
  A) reducing WAL volume
  B) reducing foreground overhead of writing FPWs - which spikes badly
  after checkpoint and the overhead is paid by the user processes
  themselves
  C) need for FPWs during base backup
 
  So that gives us a few approaches
 
  * Compressing FPWs gives A
  * Background FPWs gives us B
which looks like we can combine both ideas
 
  * Double-buffering would give us A and B, but not C
and would be incompatible with the other two ideas

 Double-buffering would allow us to disable FPWs safely, but it would make
 recovery slow.

 Is it due to the fact that during recovery it needs to check the
 contents of the double buffer as well as the page in its original location
 for consistency, or is there something else that would lead to
 slow recovery?

 Won't DBW (double-buffer write) reduce the number of pages that
 need to be read from disk compared to FPW, which would offset the
 performance degradation from any other impact?

 IIUC, in the DBW mechanism we need a temporary sequential
 log file of fixed size which is used to write data before the data
 gets written to its actual location in the tablespace.  Now, as the temporary
 log file is of fixed size, the number of pages that need to be read
 during recovery should be smaller than with FPW, because with FPW
 all the pages written to the WAL log after the last successful
 checkpoint need to be read.

Hmm... maybe I'm misunderstanding how WAL replay works in the DBW case.
Imagine the case where we try to replay two WAL records for page A when
the page has not been cached in shared_buffers yet. If FPW is enabled,
the first WAL record is an FPW, and it is simply restored into shared_buffers.
The page doesn't need to be read from disk. Then the second WAL record
is applied.

OTOH, how does this example work in the DBW case? I was thinking that
first we try to apply the first WAL record but find that page A doesn't
exist in shared_buffers yet. We try to read the page from disk, check
whether its CRC is valid, and read the same page from the double buffer
if it's invalid. After reading the page into shared_buffers, the first WAL
record can be applied. Then the second WAL record is applied. Is my
understanding right?

Regards,

-- 
Fujii Masao




Re: [HACKERS] Compression of full-page-writes

2014-06-02 Thread Fujii Masao
On Thu, May 29, 2014 at 7:21 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On 29 May 2014 01:07, Bruce Momjian br...@momjian.us wrote:
 On Wed, May 28, 2014 at 04:04:13PM +0100, Simon Riggs wrote:
 On 28 May 2014 15:34, Fujii Masao masao.fu...@gmail.com wrote:

  Also, compress_backup_block GUC needs to be merged with full_page_writes.
 
  Basically I agree with you, because I don't want to add a new GUC very
  similar to the existing one.
 
  But could you imagine the case where full_page_writes = off? Even in this
  case, FPWs are forcibly written during a base backup. Should such FPWs also
  be compressed? Which compression algorithm should be used? If we want to
  choose the algorithm for such FPWs, we would not be able to merge those two
  GUCs. IMO it's OK to always use the best compression algorithm for such FPWs
  and merge them, though.

 I'd prefer a new name altogether

 torn_page_protection = 'full_page_writes'
 torn_page_protection = 'compressed_full_page_writes'
 torn_page_protection = 'none'

 this allows us to add new techniques later like

 torn_page_protection = 'background_FPWs'

 or

 torn_page_protection = 'double_buffering'

 when/if we add those new techniques

 Uh, how would that work if you want to compress the background_FPWs?
 Use compressed_background_FPWs?

 We've currently got 1 technique for torn page protection, soon to have
 2 and with a 3rd on the horizon and likely to receive effort in next
 release.

 It seems sensible to have just one parameter to describe the various
 techniques, as suggested. I'm suggesting that we plan for how things
 will look when we have the 3rd one as well.

 Alternate suggestions welcome.

Is compression of the double buffer even worthwhile? If yes, what about
separating the GUC parameter into torn_page_protection and something like
full_page_compression? ISTM that any combination of settings of those
parameters can work.

torn_page_protection = 'FPW', 'background FPW', 'none', 'double buffer'
full_page_compression = 'no', 'pglz', 'lz4', 'snappy'

Regards,

-- 
Fujii Masao




Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-02 Thread ash

Robert Haas robertmh...@gmail.com writes:

 On Wed, May 28, 2014 at 8:22 AM, ash a...@commandprompt.com wrote:
 None of that involves answering hypothetical questions; but what you
 want to do does, and that I think is the problem in a nutshell.

 In a nutshell I'd like PostgreSQL to just re-parse the *current* view
 definition.  Should that throw an error, user intervention will be
 required anyway, but most of the time it should just work.

 What exactly do you mean by re-parse the current view definition?

I mean do what the user will have to do in this situation anyway:

BEGIN;
DROP VIEW ...;
ALTER TABLE ...;
CREATE VIEW ...;
COMMIT;

Should this fail, the user will have to work around it, but most of the
time it could just work.

 The only form of the view definition we actually have is already
 parsed into an internal form (see pg_rewrite) which, for the reasons
 I've attempted to explain, is not easy to adapt to new column types.
 I suppose we could deparse that definition and then reparse the
 results, but that could lead to some very surprising consequences
 (some of which are security-relevant).

Like?

--
Alex




Re: [HACKERS] WAL replay bugs

2014-06-02 Thread Michael Paquier
On Wed, Apr 23, 2014 at 9:43 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 And here is the tool itself. It consists of two parts:

 1. Modifications to the backend to write the page images
 2. A post-processing tool to compare the logged images between master and
 standby.
Having that in Postgres at developers' disposal would be
great, and I believe that it would greatly reduce the occurrence of
bugs caused by WAL replay during recovery. So, with the permission of
the author, I have been looking at this facility for a cleaner
integration into Postgres.

Roughly, this utility is made of three parts:
1) A set of masking functions that can be applied to page images to
normalize them. They put in magic numbers or enforce flag values to make
page content consistent across nodes. This covers, for example, the free
space between pd_lower and pd_upper, pd_flags, etc. Of course this depends
on the type of page (btree, heap, etc.).
2) A facility to record page images, detect whether they have been
modified, and flush them to a dedicated file. This mainly interacts with
the buffer manager.
3) A facility to reorder page images within the same WAL record, as a
master and a standby may not write them in the same order, due for example
to locks being released in a different order. This is part of the binary
analyzing the diffs between master and standby.

As of now, 2) is integrated into the backend, while 1) and 3) are part of
the contrib module. However, I am thinking that 1) and 2) should be done in
core using an ifdef similar to CLOBBER_FREED_MEMORY, to mask the page
images and write them to a dedicated file (in global/?), while 3)
would be fine as a separate binary in contrib/. An essential addition
would be a set of regression tests that developers and buildfarm machines
could use directly.

Perhaps some parts of what is proposed here could be made more general,
like the masking functions. So do not hesitate if you have any opinion
on the matter.

Regards,
-- 
Michael




Re: [HACKERS] backup_label revisited

2014-06-02 Thread Fujii Masao
On Thu, May 29, 2014 at 9:12 PM, Greg Stark st...@mit.edu wrote:
 So I ran into the case again where a system crashed while a hot backup
 was being taken. Postgres couldn't start up automatically because the
 backup_label was present. This has come up before e.g.
 http://www.postgresql.org/message-id/caazkufap1gxcojtyzch13rvevjevwro1vvfbysqtwgud9is...@mail.gmail.com
 but I believe no progress was made.

 I was trying to think of a way to identify whether the backup_label
 was from a backup in progress or a restore in progress. Obvious
 choices like putting the server IP address in it are not going to
 work, for several reasons.

 However, at least on Linux, wouldn't it be sufficient to put the inode
 number of the backup_label file in the backup_label? If it's still the
 same inode then it's just a restart, not a restore, since AFAIK
 there's no way for tar or the like to recreate the file with the same
 inode on any filesystem.

Could you let me know the link to the page explaining this?

 That would even protect against another
 restore on the same host.

What about the case where we restore the backup to another server and
start the recovery? In this case, ISTM inode can be the same. No?


Regards,

-- 
Fujii Masao




Re: [HACKERS] replication protocol documentation inconsistencies

2014-06-02 Thread Alvaro Herrera
Robert Haas wrote:
 On Wed, May 28, 2014 at 6:51 PM, Andres Freund and...@2ndquadrant.com wrote:
  On 2014-05-21 07:29:53 -0400, Peter Eisentraut wrote:

  More generally, it is weird that the message formats are described
  there, even though the rest of the protocol documentation only mentions
  the messages by name and then describes the formats later
  (http://www.postgresql.org/docs/devel/static/protocol-message-types.html
  and
  http://www.postgresql.org/docs/devel/static/protocol-message-formats.html).
   For example, the meaning of the (B) code isn't described until two
  sections later.
 
  I am not sure that makes sense. These messages cannot be sent as
  toplevel messages - they're just describing the contents of the CopyBoth
  stream after START_REPLICATION has begun. It seems weird to add these
  'subprotocol' messages to the toplevel protocol description.
 
 I see your point, but I think Peter has a good point, too.  It would
 be weird to document this mini-protocol mixed in with the main
 protocol, and it's so short that adding a separate section for it
 hardly makes sense, but it's still strange to have the mini-protocol
 being documented before we've explained our conventions for how we
 document wire protocol messages.  Forward references are best avoided,
 especially implicit ones.

IMHO this mini-protocol and the CopyBoth subprotocol should be
documented in full in a separate subsection -- maybe even have its own
index entry which points to one place that documents the whole thing.
We can't expect people to read the whole FE/BE protocol chapter to hunt
for the hidden references to the replication protocol or the CopyBoth
stream.

I would put aside the "subsection too small to make sense" argument.
I don't think that matters much.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-02 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 What exactly do you mean by re-parse the current view definition?
 The only form of the view definition we actually have is already
 parsed into an internal form (see pg_rewrite) which, for the reasons
 I've attempted to explain, is not easy to adapt to new column types.
 I suppose we could deparse that definition and then reparse the
 results, but that could lead to some very surprising consequences
 (some of which are security-relevant).

Note that even if we had the original CREATE VIEW text, as well as
the creation-time settings of all relevant GUCs (search_path being
the most obvious, but not the only one), reparsing like this could
still be risky; it's not the conversion to internal form that's the
real issue here.

A simple example is that
  x || 'foo'
means something quite different if x is a tsvector than if x is text,
and something different from those if x is an array, and something
different yet again if x is bit varying.  Some of those meanings
are close enough that the user might have wanted the substitution,
but others perhaps not.

Or for another example, if we have foo(x) calling foo(int),
and x's type is changed from int4 to int8, should we insert a
cast so that the same foo() is still called?  Or should we allow
the called function to be replaced by foo(int8), foo(numeric),
foo(float8), if one of those exist?

I see the OP's point that the user is likely to try a naive manual
conversion first, but at least then it's on his head to be aware
of whether he got the semantics he wants.

regards, tom lane




Re: [HACKERS] pg_stat directory and pg_stat_statements

2014-06-02 Thread Fujii Masao
On Thu, May 29, 2014 at 3:02 PM, Peter Geoghegan p...@heroku.com wrote:
 On Wed, May 28, 2014 at 10:49 PM, Fujii Masao masao.fu...@gmail.com wrote:
 You're concerned about the scenario using pg_upgrade? I'm not sure about the
 details of pg_upgrade, but if it didn't work properly, we would have run
 into trouble already.

 I'm not worried about pg_upgrade, because by design pg_stat_statements
 will discard stats files that originated in earlier versions. However,
 I don't see a need to change pg_stat_statements to serialize its
 statistics to disk in the pg_stat directory before we branch off 9.4.
 As you mentioned, it's harmless.

Yeah, that's an idea. OTOH, there is no *strong* reason to postpone
the fix to 9.5. So I just feel inclined to apply the fix now...

Regards,

-- 
Fujii Masao




Re: [HACKERS] Jsonb: jbvBinary usage in the convertJsonbValue?

2014-06-02 Thread Robert Haas
On Thu, May 29, 2014 at 7:34 AM, Dmitry Dolgov 9erthali...@gmail.com wrote:
 I'm a little confused by the convertJsonbValue function in jsonb_utils.c.
 Maybe I misunderstood something, so I need help =)

 if (IsAJsonbScalar(val) || val->type == jbvBinary)
     convertJsonbScalar(buffer, header, val);

 As far as I can see, the convertJsonbScalar function is used for both scalar
 and binary jsonb values. But this function doesn't handle the jbvBinary type.

There definitely seems to be an oversight here of some kind.  Either
convertJsonbValue() shouldn't be calling convertJsonbScalar() with an
object of type jbvBinary, or convertJsonbScalar() should know how to
handle that case.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-02 Thread Robert Haas
On Mon, Jun 2, 2014 at 8:52 AM, ash a...@commandprompt.com wrote:
 On Wed, May 28, 2014 at 8:22 AM, ash a...@commandprompt.com wrote:
 None of that involves answering hypothetical questions; but what you
 want to do does, and that I think is the problem in a nutshell.

 In a nutshell I'd like PostgreSQL to just re-parse the *current* view
 definition.  Should that throw an error, user intervention will be
 required anyway, but most of the time it should just work.

 What exactly do you mean by re-parse the current view definition?

 I mean do what the user will have to do in this situation anyway:

 BEGIN;
 DROP VIEW ...;
 ALTER TABLE ...;
 CREATE VIEW ...;
 COMMIT;

 Should this fail, the user will have to work around it, but most of the
 time it could just work.

You're either missing or choosing to ignore the point that I'm making,
which is that we *don't have* the text form of the view anywhere.  If
you try to implement what you're proposing, I'm fairly certain
that you'll find that you can't.  I agree that it would be nice if
there were a way to make that just work; I've wished for it myself - but I
don't see a reasonable way to implement it.

 The only form of the view definition we actually have is already
 parsed into an internal form (see pg_rewrite) which, for the reasons
 I've attempted to explain, is not easy to adapt to new column types.
 I suppose we could deparse that definition and then reparse the
 results, but that could lead to some very surprising consequences
 (some of which are security-relevant).

 Like?

Tom's email covers this point, so I won't repeat what he said.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_stat directory and pg_stat_statements

2014-06-02 Thread Andres Freund
On 2014-06-02 22:59:55 +0900, Fujii Masao wrote:
 On Thu, May 29, 2014 at 3:02 PM, Peter Geoghegan p...@heroku.com wrote:
  On Wed, May 28, 2014 at 10:49 PM, Fujii Masao masao.fu...@gmail.com wrote:
  You're concerned about the scenario using pg_upgrade? I'm not sure about
  the details of pg_upgrade, but if it didn't work properly, we would have
  run into trouble already.
 
  I'm not worried about pg_upgrade, because by design pg_stat_statements
  will discard stats files that originated in earlier versions. However,
  I don't see a need to change pg_stat_statements to serialize its
  statistics to disk in the pg_stat directory before we branch off 9.4.
  As you mentioned, it's harmless.
 
 Yeah, that's an idea. OTOH, there is no *strong* reason to postpone
 the fix to 9.5. So I just feel inclined to apply the fix now...

+1 for fixing it now.

Greetings,

Andres Freund

-- 
 Andres Freund                    http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-02 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Jun 2, 2014 at 8:52 AM, ash a...@commandprompt.com wrote:
 Should this fail, the user will have to work around it, but most of the
 time it could just work.

 You're either missing or choosing to ignore the point that I'm making,
 which is that we *don't have* the text form of the view anywhere.

Even if we did, I don't think it'd affect this decision.

The real problem in my mind is one of user expectations.  If the database
silently does something behind your back, people expect that that action
will be *right* and they don't have to worry about it.  I don't think
that automatically reparsing views has much chance of clearing that bar.
In much simpler, non-extensible SQL systems it could probably work, but
for better or worse Postgres has gone all-in on datatype extensibility.

regards, tom lane




Re: [HACKERS] Jsonb: jbvBinary usage in the convertJsonbValue?

2014-06-02 Thread Andrew Dunstan


On 06/02/2014 10:02 AM, Robert Haas wrote:

On Thu, May 29, 2014 at 7:34 AM, Dmitry Dolgov 9erthali...@gmail.com wrote:

I'm a little confused by the convertJsonbValue function in jsonb_utils.c.
Maybe I misunderstood something, so I need help =)


if (IsAJsonbScalar(val) || val->type == jbvBinary)
     convertJsonbScalar(buffer, header, val);

As far as I can see, the convertJsonbScalar function is used for both scalar
and binary jsonb values. But this function doesn't handle the jbvBinary type.

There definitely seems to be an oversight here of some kind.  Either
convertJsonbValue() shouldn't be calling convertJsonbScalar() with an
object of type jbvBinary, or convertJsonbScalar() should know how to
handle that case.




Yes, I've just been looking at that. I think this is probably a hangover
from when these routines were recast to some extent. Given that we're
not seeing any errors from it, I'd be inclined to remove the
|| val->type == jbvBinary part. One of the three call sites of
convertJsonbValue asserts that this can't be true, and doing so doesn't
result in a regression failure.


Peter and Teodor, comments?

cheers

andrew




Re: [HACKERS] Allowing join removals for more join types

2014-06-02 Thread Tom Lane
David Rowley dgrowle...@gmail.com writes:
 I'm not quite there with inner joins yet. I'm still getting my head around
 just where the join quals are actually stored.

TBH I think that trying to do anything at all for inner joins is probably
a bad idea.  The cases where the optimization could succeed are so narrow
that it's unlikely to be worth adding cycles to every query to check.

The planning cost of all this is likely to be a concern anyway; but
if you can show that you don't add anything unless there are some outer
joins in the query, you can at least overcome objections about possibly
adding significant overhead to trivial queries.

regards, tom lane




Re: [HACKERS] Allowing join removals for more join types

2014-06-02 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 David Rowley dgrowle...@gmail.com writes:
  I'm not quite there with inner joins yet. I'm still getting my head around
  just where the join quals are actually stored.
 
 TBH I think that trying to do anything at all for inner joins is probably
 a bad idea.  The cases where the optimization could succeed are so narrow
 that it's unlikely to be worth adding cycles to every query to check.

I agree that we don't want to add too many cycles to trivial queries but
I don't think it's at all fair to say that FK-check joins are a narrow
use-case and avoiding that join could be a very nice win.

 The planning cost of all this is likely to be a concern anyway; but
 if you can show that you don't add anything unless there are some outer
 joins in the query, you can at least overcome objections about possibly
 adding significant overhead to trivial queries.

I'm not quite buying this.  We're already beyond really trivial queries
since we're talking about joins; and considering how expensive joins
can be, putting in a bit of effort to eliminate one would be time well
worth spending, IMV.

In any case, I'd certainly suggest David continue to develop this and
then we can look at measuring the cost on cases where it was time wasted
and on cases where it helps.  We may also be able to come up with ways
to short-circuit the test and thereby minimize the cost in cases where
it won't help.

Thanks,

Stephen




Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-02 Thread ash

Tom Lane t...@sss.pgh.pa.us writes:

 Robert Haas robertmh...@gmail.com writes:
 On Mon, Jun 2, 2014 at 8:52 AM, ash a...@commandprompt.com wrote:
 Should this fail, the user will have to work around it, but most of the
 time it could just work.

 You're either missing or choosing to ignore the point that I'm making,
 which is that we *don't have* the text form of the view anywhere.

 Even if we did, I don't think it'd affect this decision.

 The real problem in my mind is one of user expectations.  If the database
 silently does something behind your back, people expect that that action
 will be *right* and they don't have to worry about it.  I don't think
 that automatically reparsing views has much chance of clearing that bar.
 In much simpler, non-extensible SQL systems it could probably work, but
 for better or worse Postgres has gone all-in on datatype extensibility.

Alright, I think I can let it go now.  It's just that the behavior was
very counter-intuitive to me (and I guess a lot of others) at first.

Thanks all for your time and in-depth explanation!

--
Alex




Re: [HACKERS] Jsonb: jbvBinary usage in the convertJsonbValue?

2014-06-02 Thread Andrew Dunstan


On 06/02/2014 10:22 AM, Andrew Dunstan wrote:


On 06/02/2014 10:02 AM, Robert Haas wrote:
On Thu, May 29, 2014 at 7:34 AM, Dmitry Dolgov 
9erthali...@gmail.com wrote:

I'm little confused by the convertJsonbValue functon at jsonb_utils.c
Maybe I misunderstood something, so I need help =)


if (IsAJsonbScalar(val) || val->type == jbvBinary)
 convertJsonbScalar(buffer, header, val);
As I can see, the convertJsonbScalar function is used for scalar and binary
jsonb values. But this function doesn't handle the jbvBinary type.

There definitely seems to be an oversight here of some kind. Either
convertJsonbValue() shouldn't be calling convertJsonbScalar() with an
object of type jbvBinary, or convertJsonbScalar() should know how to
handle that case.




Yes, I've just been looking at that. I think this is probably a 
hangover from when these routines were recast to some extent. Given 
that we're not seeing any errors from it, I'd be inclined to remove 
the || val->type == jbvBinary part. One of the three call sites 
to convertJsonbValue asserts that this can't be true, and doing so 
doesn't result in a regression failure.


Peter and Teodor, comments?




Thinking about it a bit more, ISTM this should be ok, since we convert a 
JsonbValue to Jsonb in a depth-first recursive pattern. We should 
certainly add some comments along these lines to explain why the 
jbvBinary case shouldn't arise.


cheers

andrew





Re: [HACKERS] Allowing join removals for more join types

2014-06-02 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Tom Lane (t...@sss.pgh.pa.us) wrote:
 TBH I think that trying to do anything at all for inner joins is probably
 a bad idea.  The cases where the optimization could succeed are so narrow
 that it's unlikely to be worth adding cycles to every query to check.

 I agree that we don't want to add too many cycles to trivial queries but
 I don't think it's at all fair to say that FK-check joins are a narrow
 use-case and avoiding that join could be a very nice win.

[ thinks for a bit... ]  OK, I'd been thinking that to avoid a join the
otherwise-unreferenced table would have to have a join column that is both
unique and the referencing side of an FK to the other table's join column.
But after consuming more caffeine I see I got that backwards and it would
need to be the *referenced* side of the FK, which is indeed a whole lot
more plausible case.
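To make the corrected condition concrete, here is a hypothetical schema (all
names invented for illustration) where the otherwise-unreferenced table is the
referenced side of the FK and its join column is unique:

```sql
-- Every "orders" row has exactly one match in "customers": the FK
-- guarantees a match exists (customer_id is NOT NULL), and the PK
-- guarantees it is unique.  If no column of "customers" is otherwise
-- referenced, the inner join could in principle be removed.
CREATE TABLE customers (id int PRIMARY KEY, name text);
CREATE TABLE orders (
    id int PRIMARY KEY,
    customer_id int NOT NULL REFERENCES customers(id),
    amount numeric
);

-- The join below contributes nothing to the result:
SELECT o.id, o.amount
FROM orders o
JOIN customers c ON c.id = o.customer_id;
```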

regards, tom lane




Re: plpython_unicode test (was Re: [HACKERS] buildfarm / handling (undefined) locales)

2014-06-02 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 06/01/2014 05:35 PM, Tom Lane wrote:
 I did a little bit of experimentation and determined that none of the
 LATIN1 characters are significantly more portable than what we've got:
 for instance a-acute fails to convert into 16 of the 33 supported
 server-side encodings (versus 17 failures for U+0080).  However,
 non-breaking space is significantly better: it converts into all our
 supported server encodings except EUC_CN, EUC_JP, EUC_KR, EUC_TW.
 It seems likely that we won't do better than that except with a basic
 ASCII character.

 Yeah, I just looked at the copyright symbol, with similar results.

I'd been hopeful about that one too, but nope :-(

 Let's just stick to ASCII.

The more I think about it, the more I think that using a plain-ASCII
character would defeat most of the purpose of the test.  Non-breaking
space seems like the best bet here, not least because it has several
different representations among the encodings we support.

regards, tom lane




Re: [HACKERS] recovery testing for beta

2014-06-02 Thread Jeff Janes
On Fri, May 30, 2014 at 8:09 PM, Amit Kapila amit.kapil...@gmail.com
wrote:

 On Sat, May 31, 2014 at 3:51 AM, Jeff Janes jeff.ja...@gmail.com wrote:
  On Thu, May 29, 2014 at 8:15 PM, Amit Kapila amit.kapil...@gmail.com
 wrote:
  On Thu, May 29, 2014 at 10:09 PM, Jeff Janes jeff.ja...@gmail.com
 wrote:
  
   What features in 9.4 need more beta testing for recovery?
 
  Another feature which have interaction with recovery is reduced WAL
  for Update operation:
 
 http://www.postgresql.org/message-id/e1wnqjm-0003cz...@gemulon.postgresql.org
 
  Commit: a3115f
 
 
  It looks like this is something which is always on, so it should be
 getting a good test without me taking any special steps.  But is there some
 way, for example something in xlogdump, to assess how often this is getting
 activated?

 Currently there is no simple way to get this information, but with
 attached simple patch you can get this information by pg_xlogdump.

 It will give you information in below way:

 rmgr: Heap    len (rec/tot): 47/79, tx:690, lsn:
 0/0375B4A0,
  prev 0/0375B470, bkp: , desc: hot_update: rel 1663/12125/24580; tid
 0/1 xma
 x 690 ; new tid 0/2 xmax 0; *compressed tuple:* *suffix encoded*

 rmgr: Heap    len (rec/tot): 53/85, tx:691, lsn:
 0/0375B520,
  prev 0/0375B4F0, bkp: , desc: hot_update: rel 1663/12125/24580; tid
 0/2 xma
 x 691 ; new tid 0/3 xmax 0; *compressed tuple: prefix encoded*

 rmgr: Heap    len (rec/tot): 56/88, tx:692, lsn:
 0/0375B5A8,
  prev 0/0375B578, bkp: , desc: hot_update: rel 1663/12125/24580; tid
 0/3 xma
 x 692 ; new tid 0/4 xmax 0; *uncompressed tuple*

 I think this is useful information and can be even included in core
 code.



Thanks.

Non-HOT updates can also be compressed, if they happen to land in the same
page as the old version, so I copied that code into the non-HOT update
section as well.

Taking a snapshot of a running pg_xlog directory, I found 25241
uncompressed and 14179 compressed tuples, so I think this feature is
getting exercised, though mostly in the non-HOT form.

Some side notes:

GNU make does not realize that pg_xlogdump depends
on src/backend/access/rmgrdesc/heapdesc.c.  (I don't know how or why it has
that dependency, but changes did not take effect with a simple make
install).  Is that a known issue?  Is there some way to fix it?

Also, pg_xlogdump -p  insists on being given a start position.  It would
be nice if it could just find the first file in the given directory.  Any
reason it can't do that, other than just that no one implemented it yet?

Thanks,

Jeff


Re: [HACKERS] recovery testing for beta

2014-06-02 Thread Andres Freund
Hi,

On 2014-06-02 09:03:25 -0700, Jeff Janes wrote:
 On Fri, May 30, 2014 at 8:09 PM, Amit Kapila amit.kapil...@gmail.com
 wrote:
 
  On Sat, May 31, 2014 at 3:51 AM, Jeff Janes jeff.ja...@gmail.com wrote:
   On Thu, May 29, 2014 at 8:15 PM, Amit Kapila amit.kapil...@gmail.com
  wrote:
   On Thu, May 29, 2014 at 10:09 PM, Jeff Janes jeff.ja...@gmail.com
  wrote:
   
What features in 9.4 need more beta testing for recovery?
  
   Another feature which have interaction with recovery is reduced WAL
   for Update operation:
  
  http://www.postgresql.org/message-id/e1wnqjm-0003cz...@gemulon.postgresql.org
  
   Commit: a3115f
  
  
   It looks like this is something which is always on, so it should be
  getting a good test without me taking any special steps.  But is there some
  way, for example something in xlogdump, to assess how often this is getting
  activated?
 
  Currently there is no simple way to get this information, but with
  attached simple patch you can get this information by pg_xlogdump.
 
  It will give you information in below way:
 
  rmgr: Heaplen (rec/tot): 47/79, tx:690, lsn:
  0/0375B4A0,
   prev 0/0375B470, bkp: , desc: hot_update: rel 1663/12125/24580; tid
  0/1 xma
  x 690 ; new tid 0/2 xmax 0; *compressed tuple:* *suffix encoded*
 
  rmgr: Heaplen (rec/tot): 53/85, tx:691, lsn:
  0/0375B520,
   prev 0/0375B4F0, bkp: , desc: hot_update: rel 1663/12125/24580; tid
  0/2 xma
  x 691 ; new tid 0/3 xmax 0; *compressed tuple: prefix encoded*
 
  rmgr: Heaplen (rec/tot): 56/88, tx:692, lsn:
  0/0375B5A8,
   prev 0/0375B578, bkp: , desc: hot_update: rel 1663/12125/24580; tid
  0/3 xma
  x 692 ; new tid 0/4 xmax 0; *uncompressed tuple*
 
  I think this is useful information and can be even included in core
  code.

I'd like to include something, but I think those are a bit long...

 Non-HOT updates can also be compressed, if they happen to land in the same
 page as the old version, so I copied that code into the non-HOT update
 section as well.

Right.

 GNU make does not realize that pg_xlogdump depends
 on src/backend/access/rmgrdesc/heapdesc.c.  (I don't know how or why it has
 that dependency, but changes did not take effect with a simple make
 install) Is that a known issue?  Is there someway to fix it?

Hm. I can't reproduce this here. A simple 'touch heapdesc.c' triggers a
rebuild of pg_xlogdump for me. Could you include the make output?

 Also, pg_xlogdump -p  insists on being given a start position.  I would
 be nice if it could just find the first file in the given directory.  Any
 reason it can't do that, other than just that no one implemented it yet?

It actually should accept getting a file passed instead of -s/-e.
pg_xlogdump [OPTION]... [STARTSEG [ENDSEG]]
That doesn't work for you?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] recovery testing for beta

2014-06-02 Thread Jeff Janes
On Mon, Jun 2, 2014 at 9:14 AM, Andres Freund and...@2ndquadrant.com
wrote:

 Hi,

 On 2014-06-02 09:03:25 -0700, Jeff Janes wrote:
 
  GNU make does not realize that pg_xlogdump depends
  on src/backend/access/rmgrdesc/heapdesc.c.  (I don't know how or why it
 has
  that dependency, but changes did not take effect with a simple make
  install) Is that a known issue?  Is there someway to fix it?

 Hm. I can't reproduce this here. A simple 'touch heapdesc.c' triggers a
 rebuild of pg_xlogdump for me. Could you include the make output?


Sorry, total user error.  I forgot that pg_xlogdump was in contrib, not in
core-core.  make -C contrib install does rebuild it. I guess what I
really want then is a target that builds and installs src and contrib but
not docs, so I could train my fingers to use that.



  Also, pg_xlogdump -p  insists on being given a start position.  I
 would
  be nice if it could just find the first file in the given directory.  Any
  reason it can't do that, other than just that no one implemented it yet?

 It actually should accept getting a file passed instead of -s/-e.
 pg_xlogdump [OPTION]... [STARTSEG [ENDSEG]]
 That doesn't work for you?


The STARTSEG does not seem to be optional, unless you specify both -p and
-s.  I don't think there is a very good command-line synopsis for such
conditional dependencies--I think you are following the tradition here, in
that if something is optional under any normal circumstances then it is put
in brackets.

If I give for STARTSEG a path to the data directory or the pg_xlog
directory instead of to a specific xlog file then I get bizarre error
messages like:

pg_xlogdump: FATAL:  could not find file 00DA00C2C9150080: No
such file or directory

(There is no timeline DA, nor C2C915 segments).  So STARTSEG cannot be a
directory, which makes sense though the error message could be better.

If I specify the full path of the first real log file as STARTSEG, it
works.  But it is annoying to have to figure out what the first valid file
in the directory is, then hope it hasn't changed while I type the command.
 It is less annoying on an idle system or a snapshot, but even there I'd
rather not provide information that can be safely inferred.

If there is a magic combination of command line to do what I want, I can't
find it.  If there isn't, what would be the right way to implement it?  -p
without a -s would seem like the most obvious, but just giving the
directory as the sole option would also be intuitive to me.

Cheers,

Jeff


Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-02 Thread Robert Haas
On Mon, Jun 2, 2014 at 10:21 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Jun 2, 2014 at 8:52 AM, ash a...@commandprompt.com wrote:
 Should this fail, the user will have to work around it, but most of the
 time it could just work.

 You're either missing or choosing to ignore the point that I'm making,
 which is that we *don't have* the text form of the view anywhere.

 Even if we did, I don't think it'd affect this decision.

 The real problem in my mind is one of user expectations.  If the database
 silently does something behind your back, people expect that that action
 will be *right* and they don't have to worry about it.  I don't think
 that automatically reparsing views has much chance of clearing that bar.

I agree, but I think it's important to note that Alex's complaint is
not unique - the way things work now is a real source of frustration
for users.  In a previous job, I wrote a schema-upgrade script that
dropped all of the views in reverse creation order, applied the schema
updates, and then recreated all the views. This worked, but it was a
lot of hassle that I would have preferred to avoid, and in a
higher-volume application, simultaneously grabbing exclusive locks on
a large number of critical views would have been a non-starter.  In
the job before that, I did the same thing manually, which was no fun
at all.  This was actually what prompted me to write one of my first
patches, committed by Bruce as
ff1ea2173a92dea975d399a4ca25723f83762e55.
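For context, the drop-and-recreate dance described above looks roughly like
this today (object names invented for illustration; PostgreSQL rejects the
ALTER while a dependent view exists):

```sql
-- Hypothetical schema-upgrade fragment: the dependent view must be
-- dropped before the column type change and recreated afterwards.
BEGIN;
DROP VIEW order_report;                          -- dependent view first
ALTER TABLE orders ALTER COLUMN amount TYPE numeric(12,2);
CREATE VIEW order_report AS                      -- recreate from saved DDL
    SELECT id, amount FROM orders;
COMMIT;
```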

From a technical standpoint, I'm not very sure what to do to further
improve the situation - which I will broadly characterize as "view
dependency hell" - but if I did have such an idea I might be willing
to take a modest risk of user confusion if it seemed likely to also
reduce user frustration.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] recovery testing for beta

2014-06-02 Thread Andres Freund
Hi,

On 2014-06-02 10:15:19 -0700, Jeff Janes wrote:
   Also, pg_xlogdump -p  insists on being given a start position.  I
  would
   be nice if it could just find the first file in the given directory.  Any
   reason it can't do that, other than just that no one implemented it yet?
 
  It actually should accept getting a file passed instead of -s/-e.
  pg_xlogdump [OPTION]... [STARTSEG [ENDSEG]]
  That doesn't work for you?
 
 
 The STARTSEG does not seem to be optional, unless you specify both -p and
 -s.  I don't think there is a very good command-line synopsis for such
 conditional dependencies--I think you are following the tradition here, in
 that if something is optional under any normal circumstances then it is put
 in brackets.

Maybe I have misunderstood what you actually want. You said:

 I would be nice if it could just find the first file in the given
 directory.

You mean it should scan the pg_xlog directory and find the numerically
oldest segment and start to decode from there?

 If I give for STARTSEG a path to the data directory or the pg_xlog
 directory instead of to a specific xlog file then I get bizarre error
 messages like:

 pg_xlogdump: FATAL:  could not find file 00DA00C2C9150080: No
 such file or directory
 
 (There is no timeline DA, nor C2C915 segments).  So STARTSEG cannot be a
 directory, which makes sense though the error message could be better.

Hm. Yes, that's clearly suboptimal. Will look if I can make it return a
more sensible error message.

 If I specify the full path of the first real log file as STARTSEG, it
 works.  But it is annoying to have to figure out what the first valid file
 in the directory is, then hope it hasn't changed while I type the command.
  It is less annoying on an idle system or a snapshot, but it even there I'd
 rather not provide information that can be safely inferred.

I don't think it actually can be safely inferred in a trivial
manner... I guess a mode that reads the control file and starts with
values - including the timelineid - from there would be helpful.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-02 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Jun 2, 2014 at 10:21 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 The real problem in my mind is one of user expectations.  If the database
 silently does something behind your back, people expect that that action
 will be *right* and they don't have to worry about it.  I don't think
 that automatically reparsing views has much chance of clearing that bar.

 I agree, but I think it's important to note that Alex's complaint is
 not unique - the way things work now is a real source of frustration
 for users.

Oh, I quite agree with that.  My concern here has to do with automatically
and silently making changes that we can't be very sure will meet the
user's expectations.  Perhaps what we need is some kind of UI/API design
whereby the user can inspect/modify/approve the semantic changes in
advance of pushing the red button.

regards, tom lane




Re: [HACKERS] Jsonb: jbvBinary usage in the convertJsonbValue?

2014-06-02 Thread Peter Geoghegan
On Mon, Jun 2, 2014 at 7:22 AM, Andrew Dunstan and...@dunslane.net wrote:
 Yes, I've just been looking at that. I think this is probably a hangover
 from when these routines were recast to some extent. Given that we're not
 seeing any errors from it, I'd be inclined to remove the || val->type
 == jbvBinary part. One of the three call sites to convertJsonbValue asserts
 that this can't be true, and doing so doesn't result in a regression
 failure.

Well, I guess the latter condition should be moved, since clearly the
jbvBinary case isn't handled within convertJsonbScalar(). It would be
mildly preferable to have a more accurate error message (the
convertJsonbValue() error) if that "can't happen" condition should
ever happen.

-- 
Peter Geoghegan




Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-02 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Mon, Jun 2, 2014 at 10:21 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  The real problem in my mind is one of user expectations.  If the database
  silently does something behind your back, people expect that that action
  will be *right* and they don't have to worry about it.  I don't think
  that automatically reparsing views has much chance of clearing that bar.
[...]
 From a technical standpoint, I'm not very sure what to do to further
 improve the situation - which I will broadly characterize as view
 dependency hell - but if I did have such an idea I might be willing
 to take a modest risk of user confusion if it seemed likely to also
 reduce user frustration.

Tom's point goes back to what I was trying to drive at originally-
people should have to ask for this.  Perhaps we can provide a way for
them to ask which is explicit enough that they understand this might
not do exactly what you think it does, akin to what happens today with
a drop-and-recreate-everything approach.  'CASCADE' might not be
sufficient to meet that, maybe 'CASCADE REBUILD' or something?

Of course, there is a question about if it's worth it to keep around the
exact text of each CREATE VIEW and build all this infrastructure for
something which will only work properly in a specific subset of cases
and in many others could break silently, essentially installing a very
handy looking foot-gun.  Not sure I like that either.

Thanks,

Stephen




Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-02 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Robert Haas robertmh...@gmail.com writes:
  I agree, but I think it's important to note that Alex's complaint is
  not unique - the way things work now is a real source of frustration
  for users.
 
 Oh, I quite agree with that.  My concern here has to do with automatically
 and silently making changes that we can't be very sure will meet the
 user's expectations.  Perhaps what we need is some kind of UI/API design
 whereby the user can inspect/modify/approve the semantic changes in
 advance of pushing the red button.

No clue how we'd manage to do that with just core, but could we provide
something which would make it easier for pgAdmin-or-similar to do that?

We could at least spit out NOTICE's and then see if the user decides to
commit the change but I'm not sure how well that'd really work.

In general, I like the idea..

Thanks,

Stephen




Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-02 Thread Josh Berkus
On 06/02/2014 10:40 AM, Stephen Frost wrote:
 Tom's point goes back to what I was trying to drive at originally-
 people should have to ask for this.  Perhaps we can provide a way for
 them to ask which is explicit enough that they understand this might
 not do exactly what you think it does, akin to what happens today with
 a drop-and-recreate-everything approach.  'CASCADE' might not be
 sufficient to meet that, maybe 'CASCADE REBUILD' or something?

I think CASCADE is sufficient; what else could a user mean by ALTER
TABLE ... CASCADE?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-02 Thread Andres Freund
On 2014-06-02 10:48:02 -0700, Josh Berkus wrote:
 On 06/02/2014 10:40 AM, Stephen Frost wrote:
  Tom's point goes back to what I was trying to drive at originally-
  people should have to ask for this.  Perhaps we can provide a way for
  them to ask which is explicit enough that they understand this might
  not do exactly what you think it does, akin to what happens today with
  a drop-and-recreate-everything approach.  'CASCADE' might not be
  sufficient to meet that, maybe 'CASCADE REBUILD' or something?
 
 I think CASCADE is sufficient; what else could a user mean by ALTER
 TABLE ... CASCADE?

Please also change the table's children.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-02 Thread Andres Freund
On 2014-06-02 13:40:32 -0400, Stephen Frost wrote:
 Of course, there is a question about if it's worth it to keep around the
 exact text of each CREATE VIEW and build all this infrastructure for
 something which will only work properly in a specific subset of cases
 and in many others could break silently, essentially installing a very
 handy looking foot-gun.  Not sure I like that either.

I don't think any solution that requires that is going to be
workable. It'll break in far too many cases that currently are handled
transparently. Starting with RENAME, followed by lots of other
stuff. It'd also be very confusing because pg_dump doesn't use it...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] recovery testing for beta

2014-06-02 Thread Alvaro Herrera
Jeff Janes wrote:

 GNU make does not realize that pg_xlogdump depends
 on src/backend/access/rmgrdesc/heapdesc.c.  (I don't know how or why it has
 that dependency, but changes did not take effect with a simple make
 install) Is that a known issue?  Is there someway to fix it?

Uh, you're right, it's not.  How odd.  I think this might be an obscure
pgxs bug.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-02 Thread Alvaro Herrera
Tom Lane wrote:

 Oh, I quite agree with that.  My concern here has to do with automatically
 and silently making changes that we can't be very sure will meet the
 user's expectations.  Perhaps what we need is some kind of UI/API design
 whereby the user can inspect/modify/approve the semantic changes in
 advance of pushing the red button.

I think that instead of forcing the user to append a CASCADE keyword at
the end of the command, it could perhaps return a bunch of commands to
alter all views.  The user would inspect those commands and fix those
that need fixing, then rerun the whole bunch.  I would imagine a UI
similar to git rebase, which first gives you a list of things to do,
which you can edit, and upon save-exit the final list of commands is
executed.  Any error during the execution aborts the entire transaction,
so if the user makes mistakes the thing is started afresh.

If you have a complex maze of views, I think this (or something similar
that gives enough leeway to the user) is the only way to enable a
nontrivial alteration of one of the tables or views at the bottom.
There is no way that we're going to allow automatic schema changes
otherwise.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-02 Thread Robert Haas
On Mon, Jun 2, 2014 at 1:40 PM, Stephen Frost sfr...@snowman.net wrote:
 * Robert Haas (robertmh...@gmail.com) wrote:
 On Mon, Jun 2, 2014 at 10:21 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  The real problem in my mind is one of user expectations.  If the database
  silently does something behind your back, people expect that that action
  will be *right* and they don't have to worry about it.  I don't think
  that automatically reparsing views has much chance of clearing that bar.
 [...]
 From a technical standpoint, I'm not very sure what to do to further
 improve the situation - which I will broadly characterize as view
 dependency hell - but if I did have such an idea I might be willing
 to take a modest risk of user confusion if it seemed likely to also
 reduce user frustration.

 Tom's point goes back to what I was trying to drive at originally-
 people should have to ask for this.

Well, my point is that we don't yet know what this is, so trying to
decide on whether users should be forced to request it or in exactly
what manner seems like putting the cart before the horse.  We may well
need some syntax, but it's too soon to decide what it is.

FWIW, I don't think reparsing the original view-text is even remotely
plausible.  The fact that views stay glued to the same objects even if
those objects are renamed is a pretty handy property of the current
system, and any sort of reparse-from-scratch technique would give that
up.  Deparse-and-reparse might be better, but I'll bet that has too
many problems to be viable, too (even if I haven't yet thought of what
they are).  For better or for worse, I think the best we're likely to
be able to do is somehow manipulate the already-parsed rewrite rule.
I don't have any great ideas about how to do that, either, but it
seems less problematic than going back to the SQL representation.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [GSoC] Clustering in MADlib - status update

2014-06-02 Thread Maxence Ahlouche
Hi!

2014-06-02 19:16 GMT+02:00 Hai Qian hq...@gopivotal.com:

 I like the second option for refactoring the code. I think it is doable.

 And where is your code on Github?


It's not on Github, but on my own Gitlab (a self-hosted open-source
alternative to github). You can find it here [0]. I'm using two repos: one
is a clone of madlib, the other contains my reports, my test script and
other stuff.

[0] http://git.viod.eu/public/

-- 
Maxence Ahlouche
06 06 66 97 00


Re: [HACKERS] [PATCH] Replacement for OSSP-UUID for Linux and BSD

2014-06-02 Thread Robert Haas
On Thu, May 29, 2014 at 1:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Bruce Momjian br...@momjian.us writes:
 One thing that concerns me is that we already had the problem that users
 creating the uuid-ossp extension had to double-quote the name because of
 the dash, and we have regularly questioned the viability of the
 uuid-ossp codebase.

 Now that we know we have to support alternatives, we are changing the
 build API to support those alternatives, but doing nothing to decouple
 the extension name from uuid-ossp and the dash issue.

 Well, if you've got a proposal for how to rename the extension without
 creating major compatibility problems, let's hear it.

 Seems this would be the logical time to just break compatibility and get
 a sane API for UUID generation.

 Most people think the sane API would be to put the functionality in
 core, and forget about any extension name at all.  The compatibility
 problems with that approach aren't exactly trivial either, but I suspect
 that's where we'll end up in the long run.  So I'm not that excited about
 kluge solutions for renaming the extension.

FWIW, I'm with Bruce.  I see no compelling reason to put this
functionality in core, but it sure would be nice to have it named
something less dumb.  (Don't ask me how to get there, because I
dunno.)

Also FWIW, I agree that adding this portability code now was the right
decision.  I can't see that this introduced significant instability
into PostgreSQL; if anything, it probably decreased it, because before
we had each packager hacking on it on their own, and coming up with
their own ad-hoc solutions.  Now we have an official way of doing this
which we can at least hope will be universally adopted.  Perhaps Tom
could have waited a bit longer before committing to make sure all
voices were heard, but had discussion ensued I would have voted
strongly for the change.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-02 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Tom Lane wrote:
 Oh, I quite agree with that.  My concern here has to do with automatically
 and silently making changes that we can't be very sure will meet the
 user's expectations.  Perhaps what we need is some kind of UI/API design
 whereby the user can inspect/modify/approve the semantic changes in
 advance of pushing the red button.

 I think that instead of forcing the user to append a CASCADE keyword at
 the end of the command, it could perhaps return a bunch of commands to
 alter all views.  The user would inspect those commands and fix those
 that need fixing, then rerun the whole bunch.  I would imagine a UI
 similar to git rebase, which first gives you a list of things to do,
 which you can edit, and upon save-exit the final list of commands is
 executed.  Any error during the execution abort the entire transaction,
 so if the user makes mistakes the thing is started afresh.

I think we might be better off thinking about "what support would we
need to provide to allow a tool for this to be written?" than "how would
we do this entirely inside the backend?".  SQL does not deal in user
interfaces very well, and we shouldn't try to make it do so.

I doubt that we need much help from ALTER TABLE itself.  I can
envision the tool's red button emitting commands like this:

 begin;
 create or replace view v1 as select ... dummy definition ...;
 create or replace view v2 as select ... dummy definition ...;
 ...
 alter table t alter column type ...;
 create or replace view v1 as select ... new definition ...;
 create or replace view v2 as select ... new definition ...;
 ...
 commit;

where the dummy definitions have no reference to t; they could probably
just be "SELECT null::type1 as colname1, null::type2 as colname2, ..."
In this way the ALTER TABLE itself would not need any change from current
behavior.  (But probably we need to allow CREATE OR REPLACE VIEW to change
the output column types of the view, if it is currently unreferenced.)
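As a rough illustration, the rebuild script could be generated mechanically. The sketch below (a hypothetical helper, not an actual PostgreSQL tool; all names are made up) emits the dummy definitions, the ALTER, and then the real new definitions:

```python
# Hypothetical sketch of the client-side tool's "red button": given the
# ALTER statement and the dependent views with their new definitions,
# emit dummy view definitions (no reference to the table), the ALTER
# itself, and finally the real replacement definitions.

def make_rebuild_script(alter_sql, views):
    """views: list of (view_name, columns, new_defn), where columns is a
    list of (colname, typename) pairs used for the dummy definition."""
    lines = ["begin;"]
    for name, columns, _ in views:
        dummy = ", ".join("null::%s as %s" % (typ, col) for col, typ in columns)
        lines.append("create or replace view %s as select %s;" % (name, dummy))
    lines.append(alter_sql)
    for name, _, new_defn in views:
        lines.append("create or replace view %s as %s;" % (name, new_defn))
    lines.append("commit;")
    return "\n".join(lines)

script = make_rebuild_script(
    "alter table t alter column a type bigint;",
    [("v1", [("a", "bigint")], "select a from t")],
)
print(script)
```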

We might want to add some capability whereby the transaction could error
out if anyone had changed either the target table or any of the dependent
views since the transaction in which the tool captured their definitions.
But that would be a separate command not part of the ALTER.

What's more interesting is where the tool gets the draft modified view
definitions from, and how it can highlight exactly what the semantic
changes are for the user's benefit.  There would likely be value in
adding backend capability for parsing/reverse-listing views against
hypothetical table definitions, but I'm not sure about details.

I concur with Andres' thought that storing original view text doesn't
actually help here.  Rather, what we might need is a way to change
what the reverse-lister does with a view.  The behavior of ruleutils.c
is pretty well focused on what pg_dump needs, and that may not always
be what we want for this.

regards, tom lane




Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-02 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 FWIW, I don't think reparsing the original view-text is even remotely
 plausible.  The fact that views stay glued to the same objects even if
 those objects are renamed is a pretty handy property of the current
 system, and any sort of reparse-from-scratch technique would give that
 up.

Agreed.

 Deparse-and-reparse might be better, but I'll bet that has too
 many problems to be viable, too (even if I haven't yet thought of what
 they are).  For better or for worse, I think the best we're likely to
 be able to do is somehow manipulate the already-parsed rewrite rule.
 I don't have any great ideas about how to do that, either, but it
 seems less problematic than going back to the SQL representation.

I think deparse-and-reparse is exactly what we have to do, mainly because,
if you subscribe to the idea that the user should see and approve semantic
changes, what else are we going to show her except SQL?  If she wants to
adjust the changes, it's even less plausible that the working
representation is not SQL text.  We might well produce the initial draft
form by manipulating the parsed querytree before deparsing, though.

regards, tom lane




Re: [HACKERS] [GSoC] Clustering in MADlib - status update

2014-06-02 Thread Hai Qian
I like the second option for refactoring the code. I think it is doable.

And where is your code on Github?

Hai

--
Pivotal http://www.gopivotal.com/
A new platform for a new era


On Sun, Jun 1, 2014 at 1:06 PM, Maxence Ahlouche maxence.ahlou...@gmail.com
 wrote:

 Hi all!

 I've pushed my report for this week on my repo [0]. Here is a copy!
 Attached is the patch containing my work for this week.
 Week 2 - 2014/01/01

 This week, I have worked on the beginning of the kmedoids module.
 Unfortunately, I was supposed to have something working for last Wednesday,
 and it is still not ready, mostly because I've lost time this week by being
 sick, and by packing all my stuff in preparation for relocation.

 The good news now: this week is my last school (exam) week, and that means
 full-time GSoC starting next Monday! Also, I've studied the kmeans module
 quite thoroughly, and I can finally understand how it all works, with the
 exception of one bit: the enormous SQL query used to update the
 IterationController.

 For kmedoids, I've abandoned the idea of making the loop by myself and
 have decided instead to stick to copying kmeans as much as possible, as it
 seems easier than doing it all by myself. The only part that remains to be
 adapted is that big SQL query I haven't totally understood yet. I've asked
 the help of Atri, but surely the help of an experienced MADlib hacker would
 speed things up :) Atri and I would also like to deal with this through a
 voip meeting, to ease communication. If anyone wants to join, you're
 welcome!

 As for the technology we'll use, I have a Mumble server running somewhere,
 if that fits to everyone. Otherwise, suggest something!

 I am available from Monday 2 at 3 p.m. (UTC) to Wednesday 4 at 10 a.m.
 (exam weeks are quite light).

 This week, I have also faced the first design decisions I have to make.
 For kmedoids, the centroids are points of the dataset. So, if I wanted to
 identify them precisely, I'd need to use their ids, but that would mean
 having a prototype different from the kmeans one. So, for now, I've decided
 to use the points' coordinates only, hoping I will not run into trouble. If
 I ever do, switching to ids shouldn't be too hard. Also, if the user wants
 to input initial medoids, he can input whatever points he wants, be they
 part of the dataset or not. After the first iteration, the centroids will
 be points of the dataset anyway (maybe I could just select the points
 nearest to the coordinates they input as initial centroids).
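The snap-to-nearest-point fallback mentioned there is simple to sketch. A pure-Python illustration (the real MADlib code would of course do this in SQL/C++; names are illustrative only):

```python
# Snap user-supplied initial medoid coordinates to the nearest actual
# data point, so every medoid is guaranteed to be a member of the
# dataset, as kmedoids requires.

def nearest_point(points, target):
    """Return the dataset point with minimum squared Euclidean
    distance to the target coordinates."""
    def sqdist(p):
        return sum((a - b) ** 2 for a, b in zip(p, target))
    return min(points, key=sqdist)

data = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
# the user asked for a medoid near (0.9, 1.2); snap it to (1.0, 1.0)
print(nearest_point(data, (0.9, 1.2)))
```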

 Second, I'll need to refactor the code in kmeans and kmedoids, as these
 two modules are very similar. There are several options for this:

1. One big clustering module containing everything
clustering-related (ugly but easy option);
2. A clustering module and kmeans, kmedoids, optics, utils
submodules (the best imo, but I'm not sure it's doable);
3. A clustering_utils module at the same level as the others (less
ugly than the first one, but easy too).

 Any opinions?

 Next week, I'll get a working kmedoids module, do some refactoring, and
 then add the extra methods, similar to what's done in kmeans, for the
 different seedings. Once that's done, I'll make it compatible with all
 three ports (I'm currently producing Postgres-only code, as it's the
 easiest for me to test), and write the tests and doc. The deadline for this
 last step is in two weeks; I don't know yet if I'll be on time by then or
 not. It will depend on how fast I can get kmedoids working, and how fast
 I'll go once I'm full time GSoC.

 Finally, don't hesitate to tell me if you think my decisions are wrong,
 I'm glad to learn :)
 [0] http://git.viod.eu/viod/gsoc_2014/blob/master/reports.rst


 --
 Maxence Ahlouche
 06 06 66 97 00



Re: [HACKERS] Proposing pg_hibernate

2014-06-02 Thread Gurjeet Singh
On Fri, May 30, 2014 at 5:33 PM, Josh Kupershmidt schmi...@gmail.com wrote:
 On Tue, May 27, 2014 at 10:01 PM, Gurjeet Singh gurj...@singh.im wrote:

 When the Postgres server is being stopped/shut down, the `Buffer
 Saver` scans the
 shared-buffers of Postgres, and stores the unique block identifiers of
 each cached
 block to the disk. This information is saved under the 
 `$PGDATA/pg_hibernator/`
 directory. For each of the database whose blocks are resident in shared 
 buffers,
 one file is created; for eg.: `$PGDATA/pg_hibernator/2.postgres.save`.

 This file-naming convention seems a bit fragile. For example, on my
 filesystem (HFS) if I create a database named "foo / bar", I'll get a
 complaint like:

 ERROR:  could not open "pg_hibernator/5.foo / bar.save": No such file
 or directory

 during shutdown.

Thanks for the report. I have reworked the file naming, and now the
save-file name is simply 'integer.save', so the name of a database
does not affect the file name on disk. Instead, the null-terminated
database name is now written in the file, and read back for use when
restoring the buffers.
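The revised format can be sketched as follows. This is only an illustrative mock-up of the scheme described above (the file layout and function names are assumptions, not pg_hibernator's actual code):

```python
# The save file is named '<integer>.save', so the database name cannot
# break the file name.  The name is stored inside the file as a
# NUL-terminated string, followed by the payload (here, dummy 32-bit
# block identifiers).

import os
import struct
import tempfile

def write_save_file(directory, file_no, dbname, block_ids):
    path = os.path.join(directory, "%d.save" % file_no)
    with open(path, "wb") as f:
        f.write(dbname.encode("utf-8") + b"\x00")   # NUL-terminated db name
        for b in block_ids:
            f.write(struct.pack("<I", b))
    return path

def read_save_file(path):
    with open(path, "rb") as f:
        data = f.read()
    name, _, rest = data.partition(b"\x00")
    blocks = [struct.unpack_from("<I", rest, i)[0] for i in range(0, len(rest), 4)]
    return name.decode("utf-8"), blocks

d = tempfile.mkdtemp()
p = write_save_file(d, 5, "foo / bar", [10, 20])   # slashes no longer hurt
print(read_save_file(p))   # ('foo / bar', [10, 20])
```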

Attached is the new version of pg_hibernator, with updated code and README.

Just a heads up for anyone who might have read/reviewed previous
version's code, there's some unrelated trivial code and Makefile
changes as well in this version, which can be easily spotted by a
`diff -r`.

Best regards,
-- 
Gurjeet Singh http://gurjeet.singh.im/

EDB www.EnterpriseDB.com


pg_hibernator.v2.tgz
Description: GNU Zip compressed data



Re: [HACKERS] pg_stat directory and pg_stat_statements

2014-06-02 Thread Michael Paquier
On Mon, Jun 2, 2014 at 11:07 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-06-02 22:59:55 +0900, Fujii Masao wrote:
 On Thu, May 29, 2014 at 3:02 PM, Peter Geoghegan p...@heroku.com wrote:
  On Wed, May 28, 2014 at 10:49 PM, Fujii Masao masao.fu...@gmail.com 
  wrote:
  You're concerned about the scenario using pg_upgrade? I'm not sure the 
  detail
  of pg_upgrade. But if it doesn't work properly, we should have gotten
  the trouble
 
  I'm not worried about pg_upgrade, because by design pg_stat_statements
  will discard stats files that originated in earlier versions. However,
  I don't see a need to change pg_stat_statements to serialize its
  statistics to disk in the pg_stat directory before we branch off 9.4.
  As you mentioned, it's harmless.

 Yeah, that's an idea. OTOH, there is no *strong* reason to postpone
 the fix to 9.5. So I just feel inclined to apply the fix now...

 +1 for fixing it now.
+1. A beta is here for that as well.
-- 
Michael




Re: plpython_unicode test (was Re: [HACKERS] buildfarm / handling (undefined) locales)

2014-06-02 Thread Tom Lane
I wrote:
 Andrew Dunstan and...@dunslane.net writes:
 Let's just stick to ASCII.

 The more I think about it, the more I think that using a plain-ASCII
 character would defeat most of the purpose of the test.  Non-breaking
 space seems like the best bet here, not least because it has several
 different representations among the encodings we support.

I've confirmed that a patch as attached behaves per expectation
(in particular, it passes with WIN1250 database encoding).

I think that the worry I expressed about UTF8 characters in expected-files
is probably overblown: we have such in collate.linux.utf8 test, and we've
not heard reports of that one breaking.  (Though of course, it's not run
by default :-(.)  It's still annoying that the test would fail in EUC_xx
encodings, but I see no way around that without largely lobotomizing the
test.
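The point about U+00A0 having several different representations (and none at all in the EUC_* encodings) is easy to check, e.g. with Python's codecs standing in for the server-side encodings:

```python
# U+00A0 (no-break space) encodes differently in UTF-8, Latin-1, and
# WIN1250 (Python codec 'cp1250'), and has no equivalent in EUC_JP --
# which is exactly why the test fails under the EUC_* encodings.

nbsp = "\u00a0"
print(nbsp.encode("utf-8"))     # b'\xc2\xa0'
print(nbsp.encode("latin-1"))   # b'\xa0'
print(nbsp.encode("cp1250"))    # b'\xa0'
try:
    nbsp.encode("euc_jp")
except UnicodeEncodeError:
    print("no EUC_JP equivalent")
```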

So I propose to apply this, and back-patch to 9.1 (not 9.0, because its
version of this test is different anyway --- so Tomas will have to drop
testing cs_CZ.WIN-1250 in 9.0).

regards, tom lane

diff --git a/src/pl/plpython/expected/plpython_unicode.out b/src/pl/plpython/expected/plpython_unicode.out
index 859edbb..c7546dd 100644
*** a/src/pl/plpython/expected/plpython_unicode.out
--- b/src/pl/plpython/expected/plpython_unicode.out
***
*** 1,22 
  --
  -- Unicode handling
  --
  SET client_encoding TO UTF8;
  CREATE TABLE unicode_test (
  	testvalue  text NOT NULL
  );
  CREATE FUNCTION unicode_return() RETURNS text AS E'
! return u"\\x80"
  ' LANGUAGE plpythonu;
  CREATE FUNCTION unicode_trigger() RETURNS trigger AS E'
! TD["new"]["testvalue"] = u"\\x80"
  return "MODIFY"
  ' LANGUAGE plpythonu;
  CREATE TRIGGER unicode_test_bi BEFORE INSERT ON unicode_test
FOR EACH ROW EXECUTE PROCEDURE unicode_trigger();
  CREATE FUNCTION unicode_plan1() RETURNS text AS E'
  plan = plpy.prepare("SELECT $1 AS testvalue", ["text"])
! rv = plpy.execute(plan, [u"\\x80"], 1)
  return rv[0]["testvalue"]
  ' LANGUAGE plpythonu;
  CREATE FUNCTION unicode_plan2() RETURNS text AS E'
--- 1,27 
  --
  -- Unicode handling
  --
+ -- Note: this test case is known to fail if the database encoding is
+ -- EUC_CN, EUC_JP, EUC_KR, or EUC_TW, for lack of any equivalent to
+ -- U+00A0 (no-break space) in those encodings.  However, testing with
+ -- plain ASCII data would be rather useless, so we must live with that.
+ --
  SET client_encoding TO UTF8;
  CREATE TABLE unicode_test (
  	testvalue  text NOT NULL
  );
  CREATE FUNCTION unicode_return() RETURNS text AS E'
! return u"\\xA0"
  ' LANGUAGE plpythonu;
  CREATE FUNCTION unicode_trigger() RETURNS trigger AS E'
! TD["new"]["testvalue"] = u"\\xA0"
  return "MODIFY"
  ' LANGUAGE plpythonu;
  CREATE TRIGGER unicode_test_bi BEFORE INSERT ON unicode_test
FOR EACH ROW EXECUTE PROCEDURE unicode_trigger();
  CREATE FUNCTION unicode_plan1() RETURNS text AS E'
  plan = plpy.prepare("SELECT $1 AS testvalue", ["text"])
! rv = plpy.execute(plan, [u"\\xA0"], 1)
  return rv[0]["testvalue"]
  ' LANGUAGE plpythonu;
  CREATE FUNCTION unicode_plan2() RETURNS text AS E'
*** return rv[0][testvalue]
*** 27,46 
  SELECT unicode_return();
   unicode_return 
  
!  \u0080
  (1 row)
  
  INSERT INTO unicode_test (testvalue) VALUES ('test');
  SELECT * FROM unicode_test;
   testvalue 
  ---
!  \u0080
  (1 row)
  
  SELECT unicode_plan1();
   unicode_plan1 
  ---
!  \u0080
  (1 row)
  
  SELECT unicode_plan2();
--- 32,51 
  SELECT unicode_return();
   unicode_return 
  
!   
  (1 row)
  
  INSERT INTO unicode_test (testvalue) VALUES ('test');
  SELECT * FROM unicode_test;
   testvalue 
  ---
!   
  (1 row)
  
  SELECT unicode_plan1();
   unicode_plan1 
  ---
!   
  (1 row)
  
  SELECT unicode_plan2();
diff --git a/src/pl/plpython/sql/plpython_unicode.sql b/src/pl/plpython/sql/plpython_unicode.sql
index bdd40c4..a11e5ee 100644
*** a/src/pl/plpython/sql/plpython_unicode.sql
--- b/src/pl/plpython/sql/plpython_unicode.sql
***
*** 1,6 
--- 1,11 
  --
  -- Unicode handling
  --
+ -- Note: this test case is known to fail if the database encoding is
+ -- EUC_CN, EUC_JP, EUC_KR, or EUC_TW, for lack of any equivalent to
+ -- U+00A0 (no-break space) in those encodings.  However, testing with
+ -- plain ASCII data would be rather useless, so we must live with that.
+ --
  
  SET client_encoding TO UTF8;
  
*** CREATE TABLE unicode_test (
*** 9,19 
  );
  
  CREATE FUNCTION unicode_return() RETURNS text AS E'
! return u"\\x80"
  ' LANGUAGE plpythonu;
  
  CREATE FUNCTION unicode_trigger() RETURNS trigger AS E'
! TD["new"]["testvalue"] = u"\\x80"
  return "MODIFY"
  ' LANGUAGE plpythonu;
  
--- 14,24 
  );
  
  CREATE FUNCTION unicode_return() RETURNS text AS E'
! return u"\\xA0"
  ' LANGUAGE plpythonu;
  
  CREATE FUNCTION unicode_trigger() RETURNS trigger AS E'
! TD["new"]["testvalue"] = u"\\xA0"
  return "MODIFY"
  ' LANGUAGE plpythonu;
  
*** 

Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-02 Thread Robert Haas
On Mon, Jun 2, 2014 at 2:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Deparse-and-reparse might be better, but I'll bet that has too
 many problems to be viable, too (even if I haven't yet thought of what
 they are).  For better or for worse, I think the best we're likely to
 be able to do is somehow manipulate the already-parsed rewrite rule.
 I don't have any great ideas about how to do that, either, but it
 seems less problematic than going back to the SQL representation.

 I think deparse-and-reparse is exactly what we have to do, mainly because,
 if you subscribe to the idea that the user should see and approve semantic
 changes, what else are we going to show her except SQL?  If she wants to
 adjust the changes, it's even less plausible that the working
 representation is not SQL text.  We might well produce the initial draft
 form by manipulating the parsed querytree before deparsing, though.

So I think the scenario we're talking about, simplified down to
basics, is something like this:

CREATE TABLE foo (a int);
CREATE VIEW bar AS SELECT a FROM foo;
ALTER TABLE foo ALTER COLUMN a SET DATA TYPE bigint;

If we wanted to make that last statement succeed instead of failing,
what would we want it to do?  I can see two answers.  Answer #1 is
that the column type of bar.a changes from int to bigint and the view
definition is still SELECT a FROM foo.  In that case, showing the user
the SQL does not help them see and approve semantic changes because
the SQL is completely unchanged.  Answer #2 is that the column type of
bar.a remains int4 and therefore the view definition mutates to
something like SELECT a::int4 AS a FROM foo.  In that case, showing
the user the SQL does help the user understand what is happening ...
but, as you say, you'd probably generate the new parse tree by
manipulating the existing stored rule.  And if you then deparsed it,
how would that help?  It's not like you can dump out the revised view
definition and let the user edit it and put it back in.  The view has
to get modified as part of the same action as changing the table's
column type, or you can't do anything we can't do already.  Frankly, I
don't think showing that particular thing to the user is necessary
anyway; it's not like the semantics of pushing a cast on top of every
use of the column within a related view are particularly hard to
understand.  And, anyway, whatever we do here has to be simple to
invoke or we lose most of the advantage.
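To make "answer #2" concrete, the transformation of pushing a cast on top of every use of the column can be sketched like this. Real PostgreSQL would manipulate the parsed rewrite rule, not text; this regex-based toy (entirely hypothetical) is only to show the shape of the rewrite:

```python
# Toy version of "push a cast on top of every use of the column": keep
# the view's output type fixed by rewriting bare references to the
# column as column::old_type.  A real implementation would work on the
# parse tree; a regex can mis-fire on identifiers that merely look like
# the column name.

import re

def push_cast(view_sql, column, old_type):
    # match the bare column name, but skip references already cast
    pattern = r"\b%s\b(?!::)" % re.escape(column)
    return re.sub(pattern, "%s::%s" % (column, old_type), view_sql)

print(push_cast("SELECT a FROM foo", "a", "int4"))
# SELECT a::int4 FROM foo
```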

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-02 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Jun 2, 2014 at 2:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I think deparse-and-reparse is exactly what we have to do, mainly because,
 if you subscribe to the idea that the user should see and approve semantic
 changes, what else are we going to show her except SQL?  If she wants to
 adjust the changes, it's even less plausible that the working
 representation is not SQL text.  We might well produce the initial draft
 form by manipulating the parsed querytree before deparsing, though.

 So I think the scenario we're talking about, simplified down to
 basics, is something like this:

 CREATE TABLE foo (a int);
 CREATE VIEW bar AS SELECT a FROM foo;
 ALTER TABLE foo ALTER COLUMN a SET DATA TYPE bigint;

 If we wanted to make that last statement succeed instead of failing,
 what would we want it to do?

My argument is that that command sequence, if issued exactly like that,
SHOULD fail.  It is not the backend's task to fix this case, and any
smarts you try to put into ALTER TABLE to make it work are certain
to do the wrong thing a distressingly high percentage of the time.
Rather, it should be possible to build a client-side tool that can help
users with such changes.

 I can see two answers.  Answer #1 is
 that the column type of bar.a changes from int to bigint and the view
 definition is still SELECT a FROM foo.  In that case, showing the user
 the SQL does not help them see and approve semantic changes because
 the SQL is completely unchanged.

Yeah, we need some way of highlighting the semantic differences, and just
printing ruleutils.c output doesn't do that.  But if the user is going to
put in a change to whatever choice the tool makes by default here,
I would expect that change to consist of adding (or removing) an explicit
cast in the SQL-text view definition.  We can't make people learn some
random non-SQL notation for this.

Perhaps the displayed output of the tool could look something like

CREATE VIEW bar AS
  SELECT
a  -- this view output column will now be of type int8 not int4
  FROM foo;

Or something else; I don't claim to be a good UI designer.  But in the
end, this is 90% a UI problem, and that means that raw SQL is seriously
poorly suited to solve it directly.

regards, tom lane




Re: [HACKERS] [BUGS] BUG #9652: inet types don't support min/max

2014-06-02 Thread Haribabu Kommi
On Thu, May 29, 2014 at 3:28 AM, Keith Fiske ke...@omniti.com wrote:
 On Sun, Mar 23, 2014 at 10:42 PM, Haribabu Kommi kommi.harib...@gmail.com
 wrote:

 On Fri, Mar 21, 2014 at 5:17 PM,  dar...@dons.net.au wrote:
  The following bug has been logged on the website:
  reclog= select * from foo;
 bar
  -
   1.2.3.4
  (1 row)
 
  reclog= select min(bar) from foo;
  ERROR:  function min(inet) does not exist
  LINE 1: select min(bar) from foo;
 ^
  HINT:  No function matches the given name and argument types. You might
  need
  to add explicit type casts.
 
  This is surprising to me since the inet type is ordered, hence min/max
  are
  possible.

 In the code, some comparison logic for the inet datatypes already present.
 With those I thought it is easy to support the min and max aggregates
 also.
 So I modified the code to support the aggregate functions of min and
 max by using
 the already existing inet comparison logic. Is there anything I am
 missing?

 postgres=# create table tbl(f inet);
 CREATE TABLE
 postgres=# insert into tbl values('1.2.3.5');
 INSERT 0 1
 postgres=# insert into tbl values('1.2.3.4');
 INSERT 0 1
 postgres=# select min(f) from tbl;
min
 -
  1.2.3.4
 (1 row)

 postgres=# select max(f) from tbl;
max
 -
  1.2.3.5
 (1 row)

 Patch is attached.
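The ordering the patch relies on is the total order inet values already have; the same idea can be mimicked outside the database with Python's ipaddress module (just an analogy for the semantics, not how the patch is implemented):

```python
# inet values are totally ordered, so min() and max() fall out of the
# existing comparison logic -- analogous to the patch reusing
# PostgreSQL's existing inet comparison support functions.

import ipaddress

addrs = [ipaddress.ip_address(s) for s in ("1.2.3.5", "1.2.3.4")]
print(min(addrs))   # 1.2.3.4
print(max(addrs))   # 1.2.3.5
```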

 This is my first time reviewing a patch, so apologies if I've missed
 something in the process.

 I tried applying your patch to the current git HEAD as of 2014-05-27 and the
 portion against pg_aggregate.h fails

 postgres $ patch -Np1 -i ../inet_agg.patch
 patching file src/backend/utils/adt/network.c
 Hunk #1 succeeded at 471 (offset -49 lines).
 patching file src/include/catalog/pg_aggregate.h
 Hunk #1 FAILED at 140.
 Hunk #2 FAILED at 162.
 2 out of 2 hunks FAILED -- saving rejects to file
 src/include/catalog/pg_aggregate.h.rej
 patching file src/include/catalog/pg_proc.h
 Hunk #1 succeeded at 2120 (offset 8 lines).
 Hunk #2 succeeded at 3163 (offset 47 lines).
 Hunk #3 succeeded at 3204 (offset 47 lines).
 patching file src/include/utils/builtins.h
 Hunk #1 succeeded at 907 (offset 10 lines).

 Looks like starting around April 12th some additional changes were made to
 the DATA lines in this file that have to be accounted for.

 https://github.com/postgres/postgres/commits/master/src/include/catalog/pg_aggregate.h

 Don't have the knowledge of how to fix this for the new format, but thought
 I'd at least try to apply the patch and see if it works.

Thanks for verifying the patch. Please find the re-based patch
attached in the mail.

Regards,
Hari Babu
Fujitsu Australia


inet_agg_v2.patch
Description: Binary data



Re: [HACKERS] Spreading full-page writes

2014-06-02 Thread Amit Kapila
On Mon, Jun 2, 2014 at 6:04 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Wed, May 28, 2014 at 1:10 PM, Amit Kapila amit.kapil...@gmail.com
wrote:

  IIUC in DBW mechanism, we need to have a temporary sequential
  log file of fixed size which will be used to write data before the data
  gets written to its actual location in tablespace.  Now as the temporary
  log file is of fixed size, the number of pages that needs to be read
  during recovery should be less as compare to FPW because in FPW
  it needs to read all the pages written in WAL log after last successful
  checkpoint.

 Hmm... maybe I'm misunderstanding how WAL replay works in DBW case.
 Imagine the case where we try to replay two WAL records for the page A and
 the page has not been cached in shared_buffers yet. If FPW is enabled,
 the first WAL record is FPW and firstly it's just read to shared_buffers.
 The page doesn't need to be read from the disk. Then the second WAL
 record will be applied.

 OTOH, in DBW case, how does this example case work? I was thinking that
 firstly we try to apply the first WAL record but find that the page A
 doesn't exist in shared_buffers yet. We try to read the page from the
 disk, check
 whether its CRC is valid or not, and read the same page from double buffer
 if it's invalid. After reading the page into shared_buffers, the first WAL
 record can be applied. Then the second WAL record will be applied. Is my
 understanding right?

I think the way DBW works is that before reading WAL, it first makes
data pages consistent.  It first checks the doublewrite buffer
contents against the pages in their original location.  If a page is
inconsistent in the doublewrite buffer it is simply discarded; if it is
inconsistent in the tablespace it is recovered from the doublewrite
buffer.  After reaching the end of the doublewrite buffer, it starts
reading WAL.
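That decision rule can be sketched as follows, with CRC checks standing in for page checksums. All names here are made up for illustration; this is not how any real database lays out its doublewrite buffer:

```python
# Pseudo-implementation of doublewrite-buffer recovery: a torn page in
# the doublewrite buffer is discarded (the data page must be intact),
# while a torn data page is restored from its doublewrite copy.

import zlib

def page_ok(page):
    payload, stored_crc = page["data"], page["crc"]
    return zlib.crc32(payload) == stored_crc

def recover(doublewrite, tablespace):
    for page_no, dw_page in doublewrite.items():
        if not page_ok(dw_page):
            continue                       # torn DW copy: discard it
        ts_page = tablespace.get(page_no)
        if ts_page is None or not page_ok(ts_page):
            tablespace[page_no] = dw_page  # restore torn data page from DW
    return tablespace

good = {"data": b"hello", "crc": zlib.crc32(b"hello")}
torn = {"data": b"hel??", "crc": zlib.crc32(b"hello")}  # mismatched CRC
ts = recover({1: good}, {1: torn})
print(ts[1]["data"])   # b'hello'
```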

So in the above example case, it will read the first record from WAL
and check whether the page is already in shared_buffers; if so, it
applies the WAL change, otherwise it reads the page into shared_buffers
first and then applies the WAL.  For the second record, it doesn't need
to read the page.

The saving during recovery will come from the fact that in the case
of DBW, it will not read the FPIs from WAL, but rather just the two
records (it has to read a WAL page, but that page will contain many
records).  So it seems to be a net win.

Now in the case of DBW, the extra work done (reading the doublewrite
buffer and checking its consistency with the actual pages) is always
fixed, as the size of the doublewrite buffer is fixed, so its impact
should be much less than reading FPIs from WAL after the last
successful checkpoint.

If my above understanding is right, then performance of recovery
should be better with DBW in most cases.

I think the cases where DBW might need extra care are when there are
lots of backend evictions.  In such scenarios a backend might itself
need to write both to the doublewrite buffer and to the actual page.
This can have more impact during bulk reads (when it has to set hint
bits) and during vacuum, which is performed in a ring buffer.

One improvement that could be made here is to change the buffer
eviction algorithm so that it can pass over buffers that would need
to be written to the doublewrite buffer.  There can be other
improvements as well, depending on the DBW implementation.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com