Re: [HACKERS] Oid registry

2012-09-27 Thread Daniel Farina
On Tue, Sep 25, 2012 at 5:36 PM, Peter Eisentraut pete...@gmx.net wrote:
 On Mon, 2012-09-24 at 21:18 -0700, Daniel Farina wrote:
 The gap between
 pre-JSON-in-the-standard-library in Python, Ruby, et al and
 post-JSON-in-stdlib was much smaller.

 Except in Python they renamed the thing.

By 'smaller' I mean that the gain in capability from being promoted to
the Python stdlib is comparatively smaller than the gain for data types
and operators from being promoted to core in Postgres, and the main
reason for that is that people can compose on top of Postgres-core
functionality, but those extensions themselves are not very useful in
further composition.  The extensibility at the first level is great.

There exists an entirely livable alternative-future/present where
Python never promoted the (simple)json library into the stdlib.  The
same could not be said for Postgres extensions as they exist today.

-- 
fdr




Re: [HACKERS] Switching timeline over streaming replication

2012-09-27 Thread Hannu Krosing

On 09/26/2012 01:02 AM, m...@rpzdesign.com wrote:

John:

Who has the money for Oracle RAC or for funding arrogant bastard Oracle 
CEO Ellison to purchase another island?


Postgres needs CHEAP, easy-to-set-up, self-healing 
master-master-master-master replication, and it needs it yesterday.


I was able to patch the 9.2.0 code base in 1 day and change my entire 
architecture strategy for replication
into self-healing async master-master-master plus the tiniest bit of 
sharding code imaginable.

Tell us about the compromises you had to make.

It is an established fact that you can either have it replicate fast and 
loose or slow and correct.


In the fast and loose case you have to be ready to do a lot of 
mopping-up in case of conflicts.
That is why I suggest something to replace OIDs with ROIDs for 
replication ID.  (CREATE TABLE with ROIDS)

I implement ROIDs as a uniform design pattern for the table structures.

Synchronous replication is maybe OK between 2 local machines if absolutely 
no local
hardware failure is acceptable, but cheap, scalable, synchronous, 

Scalable / synchronous is probably doable, if we are ready to take the
initial performance hit of lock propagation.


TRANSACTIONAL, master-master-master-master is a real tough slog.

I could implement global locks in the external replication layer if I 
choose, but there are much easier ways in routing
requests through the load balancer and request sharding than trying to 
manage global locks across the WAN.


Good luck with your HA patch for Postgres.

Thanks for all of the responses!

You guys are 15 times more active than the MySQL developer group, 
likely because
they do not have a single db engine that meets all the requirements 
like PG.


marco

On 9/25/2012 5:10 PM, John R Pierce wrote:

On 09/25/12 11:01 AM, m...@rpzdesign.com wrote:



At some point, every master-slave replicator gets to the point 
where they need
to start thinking about master-master replication. 


master-master and transactional integrity are mutually exclusive, 
except perhaps in special cases like Oracle RAC, where the masters 
share a coherent cache and implement global locks.



Re: [HACKERS] EVENT Keyword and CREATE TABLE

2012-09-27 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 It's a bug.  The event-trigger patch added EVENT as a new keyword, but
 forgot to list it in the unreserved_keywords production, which is
 necessary to make it actually act unreserved.

Oh. Oops. Sorry about that.

 I've committed a fix, and manually verified there are no other such
 errors at present, but this isn't the first time this has happened.
 We probably need to put in some automated cross-check, or it won't
 be the last time either.

I see you've added the automated cross-check now, thanks for that!

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] autovacuum stress-testing our system

2012-09-27 Thread Simon Riggs
On 26 September 2012 15:47, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 Really, as far as autovacuum is concerned, it would be much more useful
 to be able to reliably detect that a table has been recently vacuumed,
 without having to request a 10ms-recent pgstat snapshot.  That would
 greatly reduce the amount of time autovac spends on pgstat requests.

VACUUMing generates a relcache invalidation. Can we arrange for those
invalidations to be received by autovac launcher, so it gets immediate
feedback of recent activity without polling?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re: [WIP] Performance Improvement by reducing WAL for Update Operation

2012-09-27 Thread Heikki Linnakangas

On 25.09.2012 18:27, Amit Kapila wrote:

If you feel it is a must to do the comparison, we can do it in the same way
as we identify for HOT?


Yeah. (But as discussed, I think it would be even better to just treat 
the old and new tuple as an opaque chunk of bytes, and run them through 
a generic delta algorithm).



Can you please explain why you think that doing LZ compression on it after
encoding is better, as we have already reduced the amount of WAL
for update by only storing changed column information?

a. is it to further reduce the size of WAL
b. storing diff WAL in some standard format
c. or does it give any other kind of benefit


Potentially all of those. I don't know if it'd be better or worse, but 
my gut feeling is that it would be simpler, and produce even more 
compact WAL.


Attached is a simple patch to apply LZ compression to update WAL 
records. I modified the LZ compressor so that it can optionally use a 
separate history data, and the same history data must then be passed 
to the decompression function. That makes it work as a pretty efficient 
delta encoder, when you use the old tuple as the history data.


I ran some performance tests with the modified version of pgbench that 
you posted earlier:


Current PostgreSQL master
-

tps = 941.601924 (excluding connections establishing)
 pg_xlog_location_diff
---
 721227944

pglz_wal_update_records.patch
-

tps = 1039.792527 (excluding connections establishing)
 pg_xlog_location_diff
---
 419395208

pglz_wal_update_records.patch, COMPRESS_ONLY


tps = 1009.682002 (excluding connections establishing)
 pg_xlog_location_diff
---
 422505104


Amit's wal_update_changes_hot_update.patch
--

tps = 1092.703883 (excluding connections establishing)
 pg_xlog_location_diff
---
 436031544


The COMPRESS_ONLY result is with the attached patch, but it just uses LZ 
to compress the new tuple, without taking advantage of the old tuple. 
The pg_xlog_location_diff value is the amount of WAL generated during 
the pgbench run. Attached is also the shell script I used to run these 
tests.
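For reference, the pg_xlog_location_diff figures above can be collected with 
stock 9.2 functions; a sketch along these lines (the attached script 
presumably does the equivalent, so the exact statements here are an 
assumption):

  -- before the pgbench run: remember the current WAL insert location
  CREATE TABLE wal_start AS SELECT pg_current_xlog_insert_location() AS lsn;

  -- after the run: bytes of WAL generated in between
  SELECT pg_xlog_location_diff(pg_current_xlog_insert_location(), lsn)
    FROM wal_start;
  DROP TABLE wal_start;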


The conclusion is that there isn't very much difference among the 
patches. They all squeeze the WAL to about the same size, and the 
increase in TPS is roughly the same.


I think more performance testing is required. The modified pgbench test 
isn't necessarily very representative of a real-life application. The 
gain (or loss) of this patch is going to depend a lot on how many 
columns are updated, and in what ways. Need to test more scenarios, with 
many different database schemas.


The LZ approach has the advantage that it can take advantage of all 
kinds of similarities between old and new tuple. For example, if you 
swap the values of two columns, LZ will encode that efficiently. Or if 
you insert a character in the middle of a long string. On the flipside, 
it's probably more expensive. Then again, you have to do a memcmp() to 
detect which columns have changed with your approach, and that's not 
free either. That was not yet included in the patch version I tested. 
Another consideration is that when you compress the record more, you 
have less data to calculate CRC for. CRC calculation tends to be quite 
expensive, so even quite aggressive compression might be a win. Yet 
another consideration is that the compression/encoding is done while 
holding a lock on the buffer. For the sake of concurrency, you want to 
keep the duration the lock is held as short as possible.


- Heikki
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 5a4591e..56b53a5 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -70,6 +70,7 @@
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
 #include "utils/tqual.h"
+#include "utils/pg_lzcompress.h"
 
 
 /* GUC variable */
@@ -85,6 +86,7 @@ static HeapTuple heap_prepare_insert(Relation relation, HeapTuple tup,
 	TransactionId xid, CommandId cid, int options);
 static XLogRecPtr log_heap_update(Relation reln, Buffer oldbuf,
 ItemPointerData from, Buffer newbuf, HeapTuple newtup,
+HeapTuple oldtup,
 bool all_visible_cleared, bool new_all_visible_cleared);
 static bool HeapSatisfiesHOTUpdate(Relation relation, Bitmapset *hot_attrs,
 	   HeapTuple oldtup, HeapTuple newtup);
@@ -3195,10 +3197,12 @@ l2:
 	/* XLOG stuff */
 	if (RelationNeedsWAL(relation))
 	{
-		XLogRecPtr	recptr = log_heap_update(relation, buffer, oldtup.t_self,
-			 newbuf, heaptup,
-			 all_visible_cleared,
-			 all_visible_cleared_new);
+		XLogRecPtr	recptr;
+
+		recptr = log_heap_update(relation, buffer, oldtup.t_self,
+ newbuf, heaptup, oldtup,
+ all_visible_cleared,
+ 

Re: [HACKERS] Re: [WIP] Performance Improvement by reducing WAL for Update Operation

2012-09-27 Thread Amit Kapila
 On Thursday, September 27, 2012 4:12 PM Heikki Linnakangas wrote:
 On 25.09.2012 18:27, Amit Kapila wrote:
  If you feel it is must to do the comparison, we can do it in same way
  as we identify for HOT?
 
 Yeah. (But as discussed, I think it would be even better to just treat
 the old and new tuple as an opaque chunk of bytes, and run them through
 a generic delta algorithm).
 

Thank you for the modified patch.
 
 The conclusion is that there isn't very much difference among the
 patches. They all squeeze the WAL to about the same size, and the
 increase in TPS is roughly the same.
 
 I think more performance testing is required. The modified pgbench test
 isn't necessarily very representative of a real-life application. The
 gain (or loss) of this patch is going to depend a lot on how many
 columns are updated, and in what ways. Need to test more scenarios,
 with many different database schemas.
 
 The LZ approach has the advantage that it can take advantage of all
 kinds of similarities between old and new tuple. For example, if you
 swap the values of two columns, LZ will encode that efficiently. Or if
 you insert a character in the middle of a long string. On the flipside,
 it's probably more expensive. Then again, you have to do a memcmp() to
 detect which columns have changed with your approach, and that's not
 free either. That was not yet included in the patch version I tested.
 Another consideration is that when you compress the record more, you
 have less data to calculate CRC for. CRC calculation tends to be quite
 expensive, so even quite aggressive compression might be a win. Yet
 another consideration is that the compression/encoding is done while
 holding a lock on the buffer. For the sake of concurrency, you want to
 keep the duration the lock is held as short as possible.

Now I shall do the various tests for the following and post the results here:
a. The attached patch in the mode where it takes advantage of the history tuple
b. Changing the logic for modified-column calculation to use memcmp()


With Regards,
Amit Kapila.





Re: [HACKERS] data to json enhancements

2012-09-27 Thread Robert Haas
On Wed, Sep 26, 2012 at 1:46 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Also, on reflection I'm not sure about commandeering cast-to-json for
 this --- aren't we really casting to json member or something like
 that?  The distinction between a container and its contents seems
 important here.  With a container type as source, it might be important
 to do something different if we're coercing it to a complete JSON
 value versus something that will be just one member.  I'm handwaving
 here because I don't feel like going back to re-read the RFC, but
 it seems like something that should be considered carefully before
 we lock down an assumption that there can never be a difference.

I feel like there are two different behaviors that someone might want
here, and a cast cannot mean both.

1. Please reinterpret the existing value that I have already got as a
JSON object.  For example, you might have a text field in which you
have been storing JSON values.  Once you upgrade to 9.2, you might
want to reinterpret the existing contents of the field - which are
already valid JSON - as JSON objects.

2. Please convert the value that I have into a JSON object according
to a type-specific rule.  For example, you might have a text field in
which you store arbitrary strings.  But perhaps you need to store
structured data there, so once you upgrade to 9.2 you might want to
wrap up your strings inside JSON strings.

Now there is some subtle ambiguity here because in some cases the
behavior can be exactly the same in both cases.  For example most
numeric values will get the same treatment either way, but NaN cannot.
 If you do mynumeric::json, interpretation #1 will fail for NaN but
interpretation #2 will probably produce something like "NaN".
Similarly if the type is boolean, we could likely get away with
producing true and false for either interpretation.  If the type is
hstore, then #1 is going to fail, but #2 is going to convert '1=>2'
to {"1": "2"}.  So in general it might seem that #2 is the better
interpretation, because it gives many casts a sensible interpretation
that is otherwise lacking.

But, what about text?  It seems to me that users will count on the
fact that '[1,2,3]'::text::json is going to produce [1,2,3] (a JSON
array containing the first three numbers) and NOT "[1,2,3]" (a JSON
string containing 7 characters).  And that is emphatically
interpretation #1.   I think it would be an extremely bad idea to
decide that casts should have interpretation #2 for all data types
except things that are kind of like text, which should instead behave
like #1.  And if we standardize completely on interpretation #2, then
I think that '[1,2,3]'::json will end up meaning something different
from '[1,2,3]'::text::json, because the former will (IIUC) go through
the type-input function and end up creating a JSON array, whereas the
latter will go through the cast function and end up creating a JSON
string.  It would also mean that in more complex queries, you could
get substantially different behavior in simple cases where the parser
maintains literals as unknown vs. more complex cases where it decides
that they must be text for lack of a full type inference system.
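To make the two interpretations concrete, a minimal sketch against stock 9.2
(array_to_json stands in for interpretation #2; the text-to-JSON-string
wrapping in the last comment has no built-in and is only described):

  -- Interpretation #1: reinterpret existing text that is already valid JSON.
  SELECT '[1,2,3]'::json;                -- a JSON array of three numbers

  -- Interpretation #2: convert a value to JSON by a type-specific rule.
  SELECT array_to_json(ARRAY[1,2,3]);    -- [1,2,3], built from the int[] structure
  -- For plain text, #2 would instead yield the 7-character JSON string "[1,2,3]".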

Maybe I am being too pedantic about this and there is a way to make it
all work nicely, but it sure feels like using the casting machinery
here is blending together two different concepts that are only
sometimes the same.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] Perhaps something to consider for a future extension of the standard.




Re: [HACKERS] system_information.triggers truncate triggers

2012-09-27 Thread Robert Haas
On Wed, Sep 26, 2012 at 10:59 AM, Christopher Browne cbbro...@gmail.com wrote:
 A different place where I wound up having to jump through considerable
 hoops when doing schema analytics was vis-a-vis identifying functions.
  I need to be able to compare schemas across databases, so oid-based
 identification of functions is a total non-starter.  It appears that
 the best identification of a function would be based on the
 combination of schema name, function name, and the concatenation of
 argument data types.  It wasn't terribly difficult to construct that
 third bit, but it surely would be nice if there was a view capturing
 it, and possibly even serializing it into a table to enable indexing
 on it.  Performance-wise, function comparisons turned out to be one of
 the most expensive things I did, specifically because of that mapping
 surrounding arguments.

pg_proc.oid::regprocedure::text has been pretty good to me for this
sort of thing.
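
For instance, a query along these lines (a sketch, not Christopher's actual
analytics code) yields an OID-free identity per user function:

  -- signatures like myfunc(integer, text), schema-qualified when the schema
  -- is not on the search_path
  SELECT p.oid::regprocedure::text AS signature
  FROM pg_proc p
  JOIN pg_namespace n ON n.oid = p.pronamespace
  WHERE n.nspname NOT IN ('pg_catalog', 'information_schema')
  ORDER BY 1;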

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [WIP] Performance Improvement by reducing WAL for Update Operation

2012-09-27 Thread Amit Kapila
 On Thursday, September 27, 2012 9:12 AM Noah Misch wrote:
 On Mon, Sep 24, 2012 at 10:57:02AM +, Amit kapila wrote:
  Rebased version of patch based on latest code.
 
 I like the direction you're taking with this patch; the gains are
 striking,
 especially considering the isolation of the changes.

  Thank you for a detailed review of the patch.

 You cannot assume executor-unmodified columns are also unmodified from
 heap_update()'s perspective.  Expansion in one column may instigate
 TOAST
 compression of a logically-unmodified column, and that counts as a
 change for
 xlog delta purposes.  You do currently skip the optimization for
 relations
 having a TOAST table, but TOAST compression can still apply.  Observe
 this
 with text columns of storage mode PLAIN.  I see two ways out: skip the
 new
 behavior when need_toast=true, or compare all inline column data, not
 just
 what the executor modified.  One can probably construct a benchmark
 favoring
 either choice.  I'd lean toward the latter; wide tuples are the kind
 this
 change can most help.  If the marginal advantage of ignoring known-
 unmodified
 columns proves important, we can always bring it back after designing a
 way to
 track which columns changed in the toaster.

You are right that it can give a benefit both ways, but we should also see
which approach gives better results for most of the scenarios.
In most cases of UPDATE that I have observed, the change in values will not
increase the length of the value by much.
OTOH I am not sure; maybe there are many more scenarios which change the
length of the updated value, which can lead to the scenario you explained above.

 
 Given that, why not treat the tuple as an opaque series of bytes and
 not worry
 about datum boundaries?  When several narrow columns change together,
 say a
 sequence of sixteen smallint columns, you will use fewer binary delta
 commands
 by representing the change with a single 32-byte substitution.  If an
 UPDATE
 changes just part of a long datum, the delta encoding algorithm will
 still be
 able to save considerable space.  That case arises in many forms:
 changing
 one word in a long string, changing one element in a long array,
 changing one
 field of a composite-typed column.  Granted, this makes the choice of
 delta
 encoding algorithm more important.
 
 Like Heikki, I'm left wondering why your custom delta encoding is
 preferable
 to an encoding from the literature.  Your encoding has much in common
 with
 VCDIFF, even sharing two exact command names.  If a custom encoding is
 the
 right thing, code comments or a README section should at least discuss
 the
 advantages over an established alternative.  Idle thought: it might pay
 off to
 use 1-byte sizes and offsets most of the time.  Tuples shorter than 256
 bytes
 are common; for longer tuples, we can afford wider offsets.

My apprehension was that it can affect performance if we do more work while
holding the lock.
If we use any standard technique like LZ or VCDIFF, it has the overhead of
comparison
and other things pertaining to its algorithm.
However, using the updated patch by Heikki, I can run the various performance
tests both for the update operation as well as for recovery.

With Regards,
Amit Kapila.





Re: [HACKERS] Modest proposal: run check_keywords.pl on every build

2012-09-27 Thread Robert Haas
On Wed, Sep 26, 2012 at 9:16 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Any objections?

None, and my apologies for the oversight.

I would really like to have a cross-check for duplicated OIDs, too.
After a couple of years I think I have learned most of the rules for
what stuff I need to cross-check before committing, but it is a
nuisance to remember to do it each time and, despite my best efforts,
I have fouled things up a few times.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] data to json enhancements

2012-09-27 Thread Merlin Moncure
On Thu, Sep 27, 2012 at 8:22 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Sep 26, 2012 at 1:46 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Also, on reflection I'm not sure about commandeering cast-to-json for
 this --- aren't we really casting to json member or something like
 that?  The distinction between a container and its contents seems
 important here.  With a container type as source, it might be important
 to do something different if we're coercing it to a complete JSON
 value versus something that will be just one member.  I'm handwaving
 here because I don't feel like going back to re-read the RFC, but
 it seems like something that should be considered carefully before
 we lock down an assumption that there can never be a difference.

 I feel like there are two different behaviors that someone might want
 here, and a cast cannot mean both.

 1. Please reinterpret the existing value that I have already got as a
 JSON object.  For example, you might have a text field in which you
 have been storing JSON values.  Once you upgrade to 9.2, you might
 want to reinterpret the existing contents of the field - which are
 already valid JSON - as JSON objects.

 2. Please convert the value that I have into a JSON object according
 to a type-specific rule.  For example, you might have a text field in
 which you store arbitrary strings.  But perhaps you need to store
 structured data there, so once you upgrade to 9.2 you might want to
 wrap up your strings inside JSON strings.

 Now there is some subtle ambiguity here because in some cases the
 behavior can be exactly the same in both cases.  For example most
 numeric values will get the same treatment either way, but NaN cannot.
  If you do mynumeric::json, interpretation #1 will fail for NaN but
 interpretation #2 will probably produce something like NaN.
 Similarly if the type is boolean, we could likely get away with
 producing true and false for either interpretation.  If the type is
 hstore, then #1 is going to fail, but #2 is going to convert 1=2
 to {1:2}.  So in general it might seem that #2 is the better
 interpretation, because it gives many casts a sensible interpretation
 that is otherwise lacking.

 But, what about text?  It seems to me that users will count on the
 fact that '[1,2,3]'::text::json is going to produce [1,2,3] (a JSON
 array containing the first three numbers) and NOT [1,2,3] (a JSON
 string containing 7 characters).  And that is emphatically
 interpretation #1.

Hm.  Well, that's a really good point although I kinda disagree with
your assumption: I think it's much cleaner to have:
select  '[1,2,3]'::int[]::json
produce a json array.

All types but text (record[] etc) would seem to use the type structure
to define how the json gets laid out.  'text::json' is an exception,
because there is an implied parse, which I'm starting to unfortunately
think is the wrong behavior if you want to be able to make json datums
out of sql datums: how do you create a vanilla json text datum?
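
A small illustration of that gap on stock 9.2 (using only functions that
exist there):

  -- the implied parse: the text is treated as JSON source and becomes an array
  SELECT '[1,2,3]'::json;                        -- [1,2,3]
  -- the closest thing to a bare JSON string datum goes through a container,
  -- which is not quite the same thing
  SELECT array_to_json(ARRAY['[1,2,3]'::text]);  -- ["[1,2,3]"]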

merlin




Re: [v9.3] Extra Daemons (Re: [HACKERS] elegant and effective way for running jobs inside a database)

2012-09-27 Thread Alvaro Herrera
Excerpts from Kohei KaiGai's message of jue sep 27 01:06:41 -0300 2012:
 Hi Alvaro,
 
 Let me volunteer for reviewing, of course, but now pgsql_fdw is in my queue...

Sure, thanks -- keep in mind I entered this patch in the next
commitfest, so please do invest more effort in the ones in the
commitfest now in progress.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] data to json enhancements

2012-09-27 Thread Andrew Dunstan


On 09/27/2012 09:22 AM, Robert Haas wrote:

On Wed, Sep 26, 2012 at 1:46 PM, Tom Lane t...@sss.pgh.pa.us wrote:

Also, on reflection I'm not sure about commandeering cast-to-json for
this --- aren't we really casting to json member or something like
that?  The distinction between a container and its contents seems
important here.  With a container type as source, it might be important
to do something different if we're coercing it to a complete JSON
value versus something that will be just one member.  I'm handwaving
here because I don't feel like going back to re-read the RFC, but
it seems like something that should be considered carefully before
we lock down an assumption that there can never be a difference.

I feel like there are two different behaviors that someone might want
here, and a cast cannot mean both.

[snip]


Maybe I am being too pedantic about this and there is a way to make it
all work nicely, but it sure feels like using the casting machinery
here is blending together two different concepts that are only
sometimes the same.



OK. I think that's a very good point. I guess I was kinda swept away by 
this being suggested by a couple of influential people.


[pause for another bout of vigorous self-criticism]

So how about this suggestion: we'll look for a visible function named 
as_json or some such which has one parameter of the given type and 
returns json, and if it's present use it instead of the standard text 
representation. As an optimization we'll skip that lookup for builtin 
types, since there won't be one. Of course, we'll have one for hstore.
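
A hypothetical illustration of that convention, using point instead of hstore
so it runs on a stock install (the name as_json and the point-to-array rule
are assumptions, not an existing API):

  CREATE FUNCTION as_json(p point) RETURNS json AS $$
      SELECT ('[' || p[0] || ',' || p[1] || ']')::json
  $$ LANGUAGE sql IMMUTABLE;

  SELECT as_json(point '(1.5,2.5)');   -- [1.5,2.5]

The datum-to-json conversion code would call such a function when one is
visible for the type, and fall back to the standard text representation
otherwise.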


cheers

andrew




Re: [HACKERS] data to json enhancements

2012-09-27 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 09/27/2012 09:22 AM, Robert Haas wrote:
 Maybe I am being too pedantic about this and there is a way to make it
 all work nicely, but it sure feels like using the casting machinery
 here is blending together two different concepts that are only
 sometimes the same.

 OK. I think that's a very good point. I guess I was kinda swept away by 
 this being suggested by a couple of influential people.

Well, that doesn't make it wrong, it just means there's more work
needed.  I'm not that thrilled with magic assumptions about function
names either; schema search path issues, for example, will make that
dangerous.  We've gone to considerable lengths to avoid embedding
assumptions about operator names, and assumptions about function names
aren't any better.

There are at least three ways we could use the cast machinery for this:

(1) Reject Robert's assumption that we have to support both
interpretations for every cast situation.  For instance, it doesn't
seem that unreasonable to me to insist that you have to cast to text
and then to json if you want the literal-reinterpretation behavior.
The main problem then is figuring out a convenient way to provide
interpretation #2 for text itself.

(2) Add another hidden argument to cast functions, or perhaps repurpose
one of the ones that exist now.  This is likely to come out rather ugly
because of the need to shoehorn it into an API that's already suffered
a couple of rounds of after-the-fact additions, but it's certainly
possible in principle.  The main thing I'd want is to not define it
in a JSON-only fashion --- so the first thing is to be able to explain
the distinction we're trying to make in a type-independent way.

(3) Invent an auxiliary type along the lines of json_value and say
that you create a cast from foo to json_value when you want one
interpretation, or directly to json if you want the other.  Then
things like record_to_json would look for the appropriate type of cast.
This is a bit ugly because the auxiliary type has no reason to live
other than to separate the two kinds of cast, but it avoids creating
any new JSON-specific mechanisms in the type system.
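
A rough, runnable sketch of option (3), using point as a stand-in source
type; json_value, the function, and the cast are all made up to illustrate
the idea, not existing objects:

  CREATE TYPE json_value AS (v json);   -- crude stand-in for the auxiliary type

  CREATE FUNCTION point_to_json_value(p point) RETURNS json_value AS $$
      SELECT ROW(('[' || p[0] || ',' || p[1] || ']')::json)::json_value
  $$ LANGUAGE sql IMMUTABLE;

  CREATE CAST (point AS json_value) WITH FUNCTION point_to_json_value(point);

  SELECT (point '(1,2)')::json_value;   -- "convert by rule" goes through this cast

A cast straight to json would keep the "reinterpret the text form" meaning,
and record_to_json and friends would look for the json_value cast first.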

There might be some other ideas I'm not thinking of.

regards, tom lane




Re: [HACKERS] data to json enhancements

2012-09-27 Thread Robert Haas
On Thu, Sep 27, 2012 at 10:09 AM, Andrew Dunstan and...@dunslane.net wrote:
 So how about this suggestion: we'll look for a visible function named
 as_json or some such which has one parameter of the given type and returns
 json, and if it's present use it instead of the standard text
 representation. As an optimization we'll skip that lookup for builtin types,
 since there won't be one. Of course, we'll have one for hstore.

I think that general approach has some promise, although there are
some security issues to worry about.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] data to json enhancements

2012-09-27 Thread Misa Simic
Hm...

IMO,

'[1,2,3]'::json
'[1,2,3]'::text::json
'[1,2,3]'::int[]::json

are the same thing... (though I am not sure '[1,2,3]'::int[] is valid in
postgres...)

in JS, var o = JSON.parse(result_of_any_cast_above) should produce an array of
3 integers

'[1,2,3]' is different than '"[1,2,3]"'

If there is a need for some text value such as '[1,2,3]' to be treated as a JSON
text value, then it would be: quote_literal('[1,2,3]')::json

Kind Regards,

Misa




2012/9/27 Merlin Moncure mmonc...@gmail.com

 On Thu, Sep 27, 2012 at 8:22 AM, Robert Haas robertmh...@gmail.com
 wrote:
  On Wed, Sep 26, 2012 at 1:46 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Also, on reflection I'm not sure about commandeering cast-to-json for
  this --- aren't we really casting to json member or something like
  that?  The distinction between a container and its contents seems
  important here.  With a container type as source, it might be important
  to do something different if we're coercing it to a complete JSON
  value versus something that will be just one member.  I'm handwaving
  here because I don't feel like going back to re-read the RFC, but
  it seems like something that should be considered carefully before
  we lock down an assumption that there can never be a difference.
 
  I feel like there are two different behaviors that someone might want
  here, and a cast cannot mean both.
 
  1. Please reinterpret the existing value that I have already got as a
  JSON object.  For example, you might have a text field in which you
  have been storing JSON values.  Once you upgrade to 9.2, you might
  want to reinterpret the existing contents of the field - which are
  already valid JSON - as JSON objects.
 
  2. Please convert the value that I have into a JSON object according
  to a type-specific rule.  For example, you might have a text field in
  which you store arbitrary strings.  But perhaps you need to store
  structured data there, so once you upgrade to 9.2 you might want to
  wrap up your strings inside JSON strings.
 
  Now there is some subtle ambiguity here because in some cases the
  behavior can be exactly the same in both cases.  For example most
  numeric values will get the same treatment either way, but NaN cannot.
   If you do mynumeric::json, interpretation #1 will fail for NaN but
  interpretation #2 will probably produce something like NaN.
  Similarly if the type is boolean, we could likely get away with
  producing true and false for either interpretation.  If the type is
  hstore, then #1 is going to fail, but #2 is going to convert 1=2
  to {1:2}.  So in general it might seem that #2 is the better
  interpretation, because it gives many casts a sensible interpretation
  that is otherwise lacking.
 
  But, what about text?  It seems to me that users will count on the
  fact that '[1,2,3]'::text::json is going to produce [1,2,3] (a JSON
  array containing the first three numbers) and NOT [1,2,3] (a JSON
  string containing 7 characters).  And that is emphatically
  interpretation #1.

 Hm.  Well, that's a really good point although I kinda disagree with
 your assumption: I think it's much cleaner to have:
 select  '[1,2,3]'::int[]::json
 produce a json array.

 All types but text (record[] etc) would seem to use the type structure
 to define how the json gets laid out.  'text::json' is an exception,
 because there is an implied parse, which I'm starting to unfortunately
 think is the wrong behavior if you want to be able to make json datums
 out of sql datums: how do you create a vanilla json text datum?

 merlin





[HACKERS] psql, remove include of psqlscan.c

2012-09-27 Thread Karl O. Pinc
Hi,

This patch (psql_remove_include.patch) eliminates 
the #include of psqlscan.c at the bottom of mainloop.c.

The attached patch uses the %top{} flex feature 
introduced in flex 2.5.30 released 2003-4-1.
(See the NEWS file for flex.)

The good news is that config/programs.m4
requires flex >= 2.5.31.  The bad news 
is that RHEL 5 (released 2007-03-14
with a 13 year (10+3 years) support lifecycle)
has a flex (2.5.4a) that is too old for this patch. :-(
(At least this is what the changelog in flex
from the RHEL 5 srpm repo tells me.)
I don't know what this means.

The flex docs on my box, fwiw, don't mention %top{}
in the section describing Posix (in)compatibility.

The patch is against git head.  All the tests
pass, fwiw.  Because this depends on the toolchain
it wouldn't hurt to build on other architectures/
operating systems.

I'm thinking of exposing enough of the psql parser,
moving it to libpq, that any client-side app can
do what libpq does; given a bunch of sql
separated by semi-colons get the results
of all the statements.  This should also allow
the statement separation to be done on the client
side in libpq.  Although I don't imagine that this
will have a performance impact on the server side
it sounds like a first step toward pushing more of
the parsing onto the client.

Regards,

Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein

P.S.  Sent this patch to the list at Sep 27 00:43:28 Central time
and it never showed up so I'm sending again with a different
subject.  Info on the previous message:

Subject: Remove #include psqlscan.c from psql
message-id=1348724599.28442.0@mofo

diff --git a/src/bin/psql/Makefile b/src/bin/psql/Makefile
index 771fd71..e05b1cf 100644
--- a/src/bin/psql/Makefile
+++ b/src/bin/psql/Makefile
@@ -23,7 +23,7 @@ override CPPFLAGS := -I. -I$(srcdir) -I$(libpq_srcdir) -I$(top_srcdir)/src/bin/p
 OBJS=	command.o common.o help.o input.o stringutils.o mainloop.o copy.o \
 	startup.o prompt.o variables.o large_obj.o print.o describe.o \
 	tab-complete.o mbprint.o dumputils.o keywords.o kwlookup.o \
-	sql_help.o \
+	sql_help.o psqlscan.o \
 	$(WIN32RES)
 
 FLEXFLAGS = -Cfe -b -p -p
@@ -46,9 +46,6 @@ sql_help.c: sql_help.h ;
 sql_help.h: create_help.pl $(wildcard $(REFDOCDIR)/*.sgml)
	$(PERL) $< $(REFDOCDIR) $*
 
-# psqlscan is compiled as part of mainloop
-mainloop.o: psqlscan.c
-
 psqlscan.c: psqlscan.l
 ifdef FLEX
	$(FLEX) $(FLEXFLAGS) -o'$@' $<
diff --git a/src/bin/psql/mainloop.c b/src/bin/psql/mainloop.c
index 16c65ec..3f55813 100644
--- a/src/bin/psql/mainloop.c
+++ b/src/bin/psql/mainloop.c
@@ -413,13 +413,3 @@ MainLoop(FILE *source)
 
 	return successResult;
 }	/* MainLoop() */
-
-
-/*
- * psqlscan.c is #include'd here instead of being compiled on its own.
- * This is because we need postgres_fe.h to be read before any system
- * include files, else things tend to break on platforms that have
- * multiple infrastructures for stdio.h and so on.	flex is absolutely
- * uncooperative about that, so we can't compile psqlscan.c on its own.
- */
-#include psqlscan.c
diff --git a/src/bin/psql/mainloop.h b/src/bin/psql/mainloop.h
index 134987a..1557187 100644
--- a/src/bin/psql/mainloop.h
+++ b/src/bin/psql/mainloop.h
@@ -9,6 +9,7 @@
 #define MAINLOOP_H
 
 #include "postgres_fe.h"
+#include "psqlscan.h"
 
 int			MainLoop(FILE *source);
 
diff --git a/src/bin/psql/psqlscan.l b/src/bin/psql/psqlscan.l
index 1208c8f..069c46f 100644
--- a/src/bin/psql/psqlscan.l
+++ b/src/bin/psql/psqlscan.l
@@ -37,8 +37,17 @@
  *
  *-
  */
+%}
+	/*
+	 * postgres_fe.h must be read before any system include files, else
+	 * things tend to break on platforms that have multiple
+	 * infrastructures for stdio.h and so on.
+	 */
+%top{
 #include "postgres_fe.h"
+}
 
+%{
 #include "psqlscan.h"
 
 #include <ctype.h>




Re: [HACKERS] autovacuum stress-testing our system

2012-09-27 Thread Alvaro Herrera
Excerpts from Simon Riggs's message of jue sep 27 06:51:28 -0300 2012:
 On 26 September 2012 15:47, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 
  Really, as far as autovacuum is concerned, it would be much more useful
  to be able to reliably detect that a table has been recently vacuumed,
  without having to request a 10ms-recent pgstat snapshot.  That would
  greatly reduce the amount of time autovac spends on pgstat requests.
 
 VACUUMing generates a relcache invalidation. Can we arrange for those
 invalidations to be received by autovac launcher, so it gets immediate
 feedback of recent activity without polling?

Hmm, this is an interesting idea worth exploring, I think.  Maybe we
should sort tables in the autovac worker to-do list by age of last
invalidation messages received, or something like that.  Totally unclear
on the details, but as I said, worth exploring.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] psql, remove include of psqlscan.c

2012-09-27 Thread Tom Lane
Karl O. Pinc k...@meme.com writes:
 This patch (psql_remove_include.patch) eliminates 
 the #include of psqlscan.c at the bottom of mainloop.c.

I don't really see that this is enough of an improvement to justify
depending on a non-portable flex feature.

 I'm thinking of exposing enough of the psql parser,
 moving it to libpq, that any client-side app can
 do what libpq does; given a bunch of sql
 separated by semi-colons get the results
 of all the statements.  This should also allow
 the statement separation to be done on the client
 side in libpq.  Although I don't imagine that this
 will have a performance impact on the server side
 it sounds like a first step toward pushing more of
 the parsing onto the client.

Quite honestly, I don't like that idea at all either.  It's bad enough
that psql has to know so much low-level SQL syntax.  If we start
encouraging other programs to know that, we're going to be locked into
never again making any lexical-level changes.  Why exactly is pushing
parsing onto the client a good idea?

regards, tom lane




Re: [HACKERS] psql, remove include of psqlscan.c

2012-09-27 Thread Karl O. Pinc
On 09/27/2012 10:07:48 AM, Tom Lane wrote:
 Karl O. Pinc k...@meme.com writes:
  This patch (psql_remove_include.patch) eliminates 
  the #include of psqlscan.c at the bottom of mainloop.c.
 
 I don't really see that this is enough of an improvement to justify
 depending on a non-portable flex feature.

Ok.  (I had no idea it was non-portable.  The flex
docs don't mention this.)

 
  I'm thinking of exposing enough of the psql parser,
  moving it to libpq, that any client-side app can
  do what libpq does; given a bunch of sql
  separated by semi-colons get the results
  of all the statements.  This should also allow
  the statement separation to be done on the client
  side in libpq.  Although I don't imagine that this
  will have a performance impact on the server side
  it sounds like a first step toward pushing more of
  the parsing onto the client.
 
 Quite honestly, I don't like that idea at all either.  It's bad 
 enough
 that psql has to know so much low-level SQL syntax.  If we start
 encouraging other programs to know that, we're going to be locked 
 into
 never again making any lexical-level changes.  Why exactly is 
 pushing
 parsing onto the client a good idea?

There's really 2 ideas.  I don't care about this one:
pushing parsing/anything onto the client is a good idea because
clients scale horizontally and so performance is improved.

What I'm thinking of in libpq is the ability to give it a big string
with many sql statements and have it hand back each statement
so the client can then submit it to the server for execution.
What I really _want_ is to be able to get a big string of many
sql statements from the user and return the results, statuses,
etc. of executing each statement.  Just what psql does when,
say, fed a file from stdin.

The reason I want this is because I don't want to have to
rewrite the sql parser in PHP for inclusion in phpPgAdmin.
(I did this once, and it was such a big ugly patch
it never got around to getting into the mainline phpPgAdmin.)
phpPgAdmin (pgAdmin/ front-end of your choice)
should be able process a string of sql statements in the
same way that psql does, but currently the only way to
do this is to rewrite the parser in each application.
Much better to have libpq provide the functionality,
although I suppose it's possible to push this into the
server.


Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein





Re: [HACKERS] psql, remove include of psqlscan.c

2012-09-27 Thread Alvaro Herrera
Excerpts from Karl O. Pinc's message of jue sep 27 12:29:53 -0300 2012:

 The reason I want this is because I don't want to have to
 rewrite the sql parser in PHP for inclusion in phpPgAdmin.
 (I did this once, and it was such a big ugly patch
 it never got around to getting into the mainline phpPgAdmin.)
 phpPgAdmin (pgAdmin/ front-end of your choice)
 should be able process a string of sql statements in the
 same way that psql does, but currently the only way to
 do this is to rewrite the parser in each application.
 Much better to have libpq provide the functionality,
 although I suppose it's possible to push this into the
 server.

This seems a worthy goal to me.

But I think I see what Tom objection to it is: if we export this
capability to libpq applications, then we set it in stone to a certain
extent: exactly how things are split would become part of the API, so to
speak.  Upgrading to a newer libpq could break application code that
worked with the previous release's by splitting things differently.

I don't currently have an opinion on whether this is a bad thing or not.
Who wants to argue for/against?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] psql, remove include of psqlscan.c

2012-09-27 Thread John R Pierce

On 09/27/12 8:41 AM, Alvaro Herrera wrote:

But I think I see what Tom objection to it is: if we export this
capability to libpq applications, then we set it in stone to a certain
extent: exactly how things are split would become part of the API, so to
speak.  Upgrading to a newer libpq could break application code that
worked with the previous release's by splitting things differently.

I don't currently have an opinion on whether this is a bad thing or not.
Who wants to argue for/against?


I wonder if it shouldn't be in a separate 'helper' library, as it has no 
direct ties to any libpq internals.




--
john r pierce                      N 37, W 122
santa cruz ca                      mid-left coast





Re: [HACKERS] XLogInsert scaling, revisited

2012-09-27 Thread Heikki Linnakangas

On 24.09.2012 21:06, Fujii Masao wrote:

The patch could be applied cleanly and the compile could be successfully done.


Thanks for the testing!


But when I ran initdb, I got the following assertion error:

--
$ initdb -D data --locale=C --encoding=UTF-8
The files belonging to this database system will be owned by user postgres.
This user must also own the server process.

The database cluster will be initialized with locale C.
The default text search configuration will be set to english.

creating directory data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
creating configuration files ... ok
creating template1 database in data/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... TRAP: FailedAssertion("!(((uint64)
currpos) % 8192 >= (((intptr_t) ((sizeof(XLogPageHeaderData))) + ((8)
- 1)) & ~((intptr_t) ((8) - 1))) || rdata_len == 0)", File: "xlog.c",
Line: 1363)
sh: line 1: 29537 Abort trap: 6   /dav/hoge/bin/postgres
--single -F -O -c search_path=pg_catalog -c exit_on_error=true
template1 > /dev/null
child process exited with exit code 134
initdb: removing data directory data
--

I got the above problem on MacOS:


Hmm, I cannot reproduce this on my Linux laptop. However, I think I see 
what the problem is: the assertion should assert that (*CurrPos* % 
XLOG_BLCKSZ >= SizeOfXLogShortPHD), not currpos. The former is an 
XLogRecPtr, the latter is a pointer. If the WAL buffers are aligned at 
8k boundaries, the effect is the same, but otherwise the assertion is 
just wrong. And as it happens, if O_DIRECT is defined, we align WAL 
buffers at XLOG_BLCKSZ. I think that's why I don't see this on my 
laptop. Does Mac OS X not define O_DIRECT?


Anyway, attached is a patch with that fixed.

- Heikki


xloginsert-scale-21.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] psql, remove include of psqlscan.c

2012-09-27 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 But I think I see what Tom objection to it is: if we export this
 capability to libpq applications, then we set it in stone to a certain
 extent: exactly how things are split would become part of the API, so to
 speak.  Upgrading to a newer libpq could break application code that
 worked with the previous release's by splitting things differently.

That's not exactly what the problem is: an upgrade could only break
*existing* application code if we had made a non-backward-compatible
change in SQL lexical rules, which hopefully we'd never do.

Rather, the problem is that the server might know about some newer
lexical feature, and so might the application, but if libpq is behind
the times then it's broken.  You can see an instance of this right now
over in the pgsql-jdbc list, where they're arguing about fixing the
JDBC driver to understand dollar quoting.  Application authors not
unreasonably think it should have heard of that by now.

The JDBC example is sort of an argument in favor of Karl's idea,
in that if client-side code is going to know this anyway then it would
be better if it were at least localized in one place.  But of course
JDBC is never going to depend on libpq (wrong language) so moving this
code into libpq isn't actually going to help them.

A larger point is that I don't believe this is actually going to help
anybody, because of mismatch of requirements not only implementation
language.  JDBC couldn't use a libpq lexer implementation even without
the language issue, because the context in which they're arguing about
this is finding and replacing JDBC-spec escape sequences, which libpq is
not going to know about.  I imagine PHP has got the same problem only
different.  Conversely, psql's lexer has a lot of psql-specific design
decisions, such as the need to handle backslash commands and include
files, that I don't think would belong in a libpq implementation.

regards, tom lane




Re: [HACKERS] Oid registry

2012-09-27 Thread Robert Haas
On Wed, Sep 26, 2012 at 10:08 AM, Andrew Dunstan and...@dunslane.net wrote:
 How many would EDB need for it to be useful?

Looks like we currently have about 160 hard-coded OIDs in our fork
that are not in PostgreSQL, across all catalogs.  Actually, there are
probably some things we could do to reduce that number, which might be
the smarter way to go.  But taken at face value I suppose that means
we would need a reservation of several hundred to allow for future
growth.

 I am somewhat opposed to the idea of an OID registry.  I can't see why
 the space of things people want to reserve OIDs for would be only tens
 rather than thousands.  There are surely plenty of extensions that
 would like to depend on specific other extensions, or on core.  If we
 legislate that hard-coded OIDs are the way to do that, I think we're
 going to end up with a lot of 'em.

 Maybe you have a better feel than I do for how many live addon types are out
 there. Maybe there are hundreds or thousands, but if there are I am
 blissfully unaware of them.

Well, that's a fair point.  There probably aren't.  But then again,
the proposed registry wouldn't only cover live projects.  We'd
probably have a decent number of people say: I can't do what I want
unless I have an OID.  And then the project becomes dormant or
obsolescent but we still have the OID reservation and there's not
really any politically feasible way of recovering the address space.
I can't help thinking that it sounds a little bit like what IANA does,
and the problems they face, except with 2^16 OIDs instead of 2^32 IP
addresses.  Admittedly there should be a lot less demand for type OIDs
than for IP addresses, but the amount of address space we can allocate
without pain is also much smaller.

 I did like the alternative idea upthread of UUIDs for types which would give
 them a virtually unlimited space.

Yeah, me too.  That doesn't require a centralized authority (hence, no
debates here about whether a given extension is important enough to
merit an allocation of a given size), doesn't move us further in the
direction of exposing the database's internal identifiers as a concept
that users have to care about, and provides essentially infinite
address space.  There's more engineering work involved but sometimes
more engineering work means a better result.
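
One hypothetical shape of that idea, purely as a sketch (the table, its
columns, and the UUID value are all made up for illustration):

  -- an extension ships a stable, self-assigned identifier for each type it
  -- defines, instead of relying on an OID reserved in core
  CREATE TABLE extension_type_identity (
      type_uuid uuid PRIMARY KEY,
      nspname   name NOT NULL,
      typname   name NOT NULL
  );

  INSERT INTO extension_type_identity
  VALUES ('a3d1c5e2-0c9b-4f5e-9b7a-1d2e3f4a5b6c', 'public', 'hstore');

  -- a dependent extension resolves the actual OID at runtime
  SELECT t.oid
  FROM extension_type_identity i
  JOIN pg_namespace n ON n.nspname = i.nspname
  JOIN pg_type t ON t.typnamespace = n.oid AND t.typname = i.typname
  WHERE i.type_uuid = 'a3d1c5e2-0c9b-4f5e-9b7a-1d2e3f4a5b6c';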

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys

2012-09-27 Thread Noah Misch
On Tue, Sep 18, 2012 at 05:52:51PM +0200, Marco Nenciarini wrote:
 please find attached the refreshed v1 patch.

I perused this version in comparison to the last version I reviewed, finding
minor problems.  First, a warning:

tablecmds.c: In function `ATExecAddConstraint':
tablecmds.c:5898: warning: `fk_element_type' may be used uninitialized in this 
function
tablecmds.c:5898: note: `fk_element_type' was declared here

I don't see an actual bug; add a dead store to silence the compiler.

 *** a/src/test/regress/parallel_schedule
 --- b/src/test/regress/parallel_schedule
 *** test: event_trigger
 *** 94,100 
   # --
   # Another group of parallel tests
   # --
 ! test: select_views portals_p2 foreign_key cluster dependency guc bitmapops 
 combocid tsearch tsdicts foreign_data window xmlmap functional_deps 
 advisory_lock json
   
   # --
   # Another group of parallel tests
 --- 94,100 
   # --
   # Another group of parallel tests
   # --
 ! test: select_views portals_p2 foreign_key cluster dependency guc bitmapops 
 combocid tsearch tsdicts foreign_data window xmlmap functional_deps 
 advisory_lock element_foreign_key

Keep that json test.

 + errmsg("array ELEMENT foreign keys only support NO ACTION "
 +"and RESTRICT actions")));

Project style is not to break message literals; instead, let the line run
long.  There are a few more examples of this in your patch.


Those problems are isolated and do not impugn design, so committer time would
be just as well-spent on the latest version.  As such, I'm marking the patch
Ready for Committer.  Thanks to Rafal Pietrak for his helpful review.

nm




Re: [HACKERS] Oid registry

2012-09-27 Thread David Johnston
 
  I did like the alternative idea upthread of UUIDs for types which
  would give them a virtually unlimited space.
 
 Yeah, me too.  That doesn't require a centralized authority (hence, no
 debates here about whether a given extension is important enough to merit
 an allocation of a given size), doesn't move us further in the direction
of
 exposing the database's internal identifiers as a concept that users have
to
 care about, ad provides essentially infinite address space.  There's more
 engineering work involved but sometimes more engineering work means a
 better result.
 

Random thought from the sideline...

Git is able to provide assurances as to content because it creates a hash.
Now, for a function, PostgreSQL could hash the catalog entry (including the
function body) and return that as proof that said function is the same as
one installed in some other database or published publicly.  I do not know
enough about the other objects to know if something similar is possible but
maybe this will spark someone else's thoughts.
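
A rough illustration of that idea for functions (which columns belong in the
hash is an assumption; a complete identity would need arguments, settings,
and so on):

  SELECT p.oid::regprocedure::text AS signature,
         md5(p.prorettype::regtype::text || ':' ||
             coalesce(p.probin, '') || ':' || p.prosrc) AS content_hash
  FROM pg_proc p
  JOIN pg_namespace n ON n.oid = p.pronamespace
  WHERE n.nspname NOT IN ('pg_catalog', 'information_schema');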

David J.








Re: [HACKERS] psql, remove include of psqlscan.c

2012-09-27 Thread Karl O. Pinc
On 09/27/2012 11:02:42 AM, Tom Lane wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:

 A larger point is that I don't believe this is actually going to help
 anybody, because of mismatch of requirements not only implementation
 language.  JDBC couldn't use a libpq lexer implementation even 
 without
 the language issue, because the context in which they're arguing 
 about
 this is finding and replacing JDBC-spec escape sequences, which libpq
 is
 not going to know about.  I imagine PHP has got the same problem only
 different.  Conversely, psql's lexer has a lot of psql-specific 
 design
 decisions, such as the need to handle backslash commands and include
 files, that I don't think would belong in a libpq implementation.

Well no, I'm not at all interested in escape sequences.
I want to take sql directly from the user and execute it.
Right now I can take only one statement at a time.
And this is true of any human-facing application.

I'm not looking for a general purpose solution, although
it did occur to me that the psql variable substitution
mechanism could be exposed.

But what I really want is not an exposed parser.  What
I really want is to be able to get results from all the
statements passed to PQexec (et-al), not just the
last statement.  This could be done without exposing
the parser, but it does mean having a parser in libpq.
Since psql links to libpq anyway my plan was to
move the parser entirely out of psql into libpq and
have an undocumented internal interface so that
psql can do the escaping/variable substitution stuff.

Regards,

Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql, remove include of psqlscan.c

2012-09-27 Thread Karl O. Pinc
On 09/27/2012 11:02:42 AM, Tom Lane wrote:

 Rather, the problem is that the server might know about some newer
 lexical feature, and so might the application, but if libpq is behind
 the times then it's broken. 

If the application knows about the newer feature and wants
to use it, is it unreasonable to ask the application
be linked against a newer libpq?


Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Oid registry

2012-09-27 Thread Magnus Hagander
On Sep 27, 2012 9:52 AM, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Sep 26, 2012 at 10:08 AM, Andrew Dunstan and...@dunslane.net
wrote:
  How many would EDB need for it to be useful?

 Looks like we currently have about 160 hard-coded OIDs in our fork
 that are not in PostgreSQL, across all catalogs.  Actually, there are
 probably some things we could do to reduce that number, which might be
 the smarter way to go.  But taken at face value I suppose that means
 we would need a reservation of several hundred to allow for future
 growth.

I'm not sure that's a way we really want to go down. How do we define which
third party vendors would get to reserve oids? And how many? And under what
other potential terms?

Seems like we'd set ourselves up for endless discussions and bike
shedding...

  I am somewhat opposed to the idea of an OID registry.  I can't see why
  the space of things people want to reserve OIDs for would be only tens
  rather than thousands.  There are surely plenty of extensions that
  would like to depend on specific other extensions, or on core.  If we
  legislate that hard-coded OIDs are the way to do that, I think we're
  going to end up with a lot of 'em.
 
  Maybe you have a better feel than I do for how many live addon types
are out
  there. Maybe there are hundreds or thousands, but if there are I am
  blissfully unaware of them.

 Well, that's a fair point.  There probably aren't.  But then again

There are probably more than we know. Many many internal things at
different organizations.

 the proposed registry wouldn't only cover live projects.  We'd
 probably have a decent number of people say: I can't do what I want
 unless I have an OID.  And then the project becomes dormant or
 obsolescent but we still have the OID reservation and there's not
 really any politically feasible way of recovering the address space.
 I can't help thinking that it sounds a little bit like what IANA does,
 and the problems they face, except with 2^16 OIDs instead of 2^32 IP
 addresses.  Admittedly there should be a lot less demand for type OIDs
 than for IP addresses, but the amount of address space we can allocate
 without pain is also much smaller.

Yeah, I think that would rapidly turn into a pain point.

  I did like the alternative idea upthread of UUIDs for types which would
give
  them a virtually unlimited space.

 Yeah, me too.  That doesn't require a centralized authority (hence, no
 debates here about whether a given extension is important enough to
 merit an allocation of a given size), doesn't move us further in the
 direction of exposing the database's internal identifiers as a concept
 that users have to care about, and provides essentially infinite
 address space.  There's more engineering work involved but sometimes
 more engineering work means a better result.

Yeah, seems like it would work much better long term.

/Magnus


Re: [HACKERS] patch: shared session variables

2012-09-27 Thread Pavel Stehule
Hello

2012/9/24 Heikki Linnakangas hlinnakan...@vmware.com:
 Having read through this thread, the consensus seems to be that we don't
 want this patch as it is (and I agree with that).

 As I understand it, you are trying to solve two problems:

 1. Passing parameters to a DO statement. You could quote the parameters and
 inline them in the block itself in the client, but quoting can be difficult
 to do correctly and is not as readable anyway, so I agree it would be good
 to be able to pass them separately.

 2. Returning results from a DO statement. At the moment, a DO statement is a
 utility statement, so it cannot return a result set.

 Two different approaches to these problems have been discussed that have
 some support:

 A) WITH FUNCTION syntax, to allow defining a temporary function for a
 single query, similar to the current WITH syntax.
 (http://archives.postgresql.org/pgsql-hackers/2012-07/msg00426.php)

 B) DO ... USING parameter list syntax. This is a straightforward extension
 of the current DO syntax, just adding a parameter list to it. Not sure how
 to return a result set from this, perhaps also support a RETURNS keyword,
 similar to CREATE FUNCTION.

 I'm ok with either of those approaches. A) would be more flexible, while B)
 would be straightforward extension of what we already have.

 I'm marking this patch as rejected in the commitfest app. Please pursue the
 WITH FUNCTION or DO ... USING syntax instead.

A basic discussion should be about the character of the DO statement. What
should it be: a temporary function or a temporary stored procedure? Both
variants are defensible. I prefer the idea where DO is a temporary procedure.
The answer to that question also settles the question of returning a result
from a DO statement. We can support more calling contexts for DO statements,
like we do with general functions - and then both your @A and @B
variants could be supported.

A blocker for @B is the lack of parametrization for utility
statements. If we can support a CALL statement, then we will have a
solution for a parametrized DO statement too.

From my perspective, the missing parametrized DO is not a blocker for
anything. It is not super elegant, but the workaround needs only a few
more lines. \gset covers the last missing piece of functionality.

So I would like to close this topic - we should implement procedures and a
CALL statement first - this functionality is relatively clean - the basics
are described by ANSI SQL - and as a next step we can reimplement
anonymous (temporary) functions or procedures.

Regards

Pavel


 Thanks!


 On 31.08.2012 21:27, Pavel Stehule wrote:

 2012/8/31 Dimitri Fontainedimi...@2ndquadrant.fr:

 Pavel Stehulepavel.steh...@gmail.com  writes:

 Pavel, you didn't say what you think about the WITH FUNCTION proposal?


 I don't like it - this proposal is too lispish - it is not SQL


 We're not doing lambda here, only extending a facility that we rely on
 today. The function would be named, for one. I don't know what you mean
 by SQL being lispish here, and I can't imagine why it would be something
 to avoid.

 And you didn't say how do you want to turn a utility statement into
 something that is able to return a result set.


 if we support real procedures ala sybase procedures (MySQL, MSSQL..)
 - then we can return result with same mechanism - there are no
 significant difference between DO and CALL statements - you don't know
 what will be result type before you call it.


 Currently we don't have CALL, and we have DO which is not a query but a
 utility statement. Are you proposing to implement CALL? What would be
 the difference between making DO a query and having CALL?


 defacto a CALL statement implementation can solve this issue.

 The core of this issue is an impossibility using parameters for
 utility statements.  CALL and DO are utility statements - and if we
 can use parameters for CALL, then we can do it for DO too.

 CALL statement starts a predefined batch - inside this batch, you can
 do anything - can use COMMIT, ROLLBACK, SELECTs, ... DO is some batch
 with immediate start. Sure, there is relative significant between
 stored procedures implemented in popular RDBMS and although I don't
 like T-SQL too much, I like sybase concept of stored procedures - it
 is strong and very useful for maintaining tasks.

 Regards

 Pavel





 Regards,
 --
 Dimitri Fontaine
 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


 - Heikki


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal - assign result of query to psql variable

2012-09-27 Thread Pavel Stehule
Hello

2012/9/21 Shigeru HANADA shigeru.han...@gmail.com:
 Hi Pavel,

 (2012/09/21 2:01), Pavel Stehule wrote:
 - Is it intentional that \gset can set special variables such as
 AUTOCOMMIT and HOST?  I don't see any downside for this behavior,
 because \set also can do that, but it is not documented nor tested at all.


 I use a same SetVariable function, so a behave should be same

 It seems reasonable.

 Document
 
 - Adding some description of \gset command, especially about limitation
 of variable list, seems necessary.
 - In addition to the meta-command section, Advanced features section
 mentions how to set psql's variables, so we would need some mention
 there too.
 - The term target list might not be familiar to users, since it
 appears in only sections mentioning PG internal relatively.  I think
 that the feature described in the section Retrieving Query Results in
 ECPG document is similar to this feature.
 http://www.postgresql.org/docs/devel/static/ecpg-variables.html

 I invite any proposals about enhancing documentation. Personally I am
 a PostgreSQL developer, so I don't known any different term other than
 target list - but any user friendly description is welcome.

 How about to say stores the query's result output into variable?
 Please see attached file for my proposal.  I also mentioned about 1-row
 limit and omit of variable.

should be


 Coding
 ==
 The code follows our coding conventions.  Here are comments for coding.

 - Some typo found in comments, please see attached patch.
 - There is a code path which doesn't print error message even if libpq
 reported error (PGRES_BAD_RESPONSE, PGRES_NONFATAL_ERROR,
 PGRES_FATAL_ERROR) in StoreQueryResult.  Is this intentional?  FYI, ecpg
 prints bad response message for those errors.

 yes - it is question. I use same pattern like PrintQueryResult, but
 bad response message should be used.

 I am sending updated patch

 It seems ok.

 BTW, as far as I see, no psql backslash command including \setenv (it
 was added in 9.2) has regression test in core (I mean src/test/regress).
  Is there any convention about this issue?  If psql backslash commands
 (or any psql feature else) don't need regression test, we can remove
 psql.(sql|out).
 # Of course we need to test new feature by hand.

That is a question for Tom or David - only server-side functionality has
regression tests. But the results of some backslash commands are verified in
other regression tests. I would like to see some regression tests for this
functionality.


 Anyway, IMO the name psql impresses larger area than the patch
 implements.  How about to rename psql to psql_cmd or backslash_cmd than
 psql as regression test name?


I have no other ideas - psql_cmd is a good name

Regards

Pavel

 --
 Shigeru HANADA


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql, remove include of psqlscan.c

2012-09-27 Thread Tom Lane
Karl O. Pinc k...@meme.com writes:
 What I'm thinking of in libpq is the ability to give it big string
 with many sql statements and have it hand back each statement
 so the client can then submit it to the server for execution.
 What I really _want_ is to be able get a bit string of many
 sql statements from the user and return the results, statuses,
 etc. of executing each statement.  Just what psql does when,
 say, fed a file from stdin.

Just as a note --- I believe you can get that result today with
PQsendQuery followed by a PQgetResult loop.  There's no need to
provide an API that splits the string first.
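
For illustration, a minimal sketch of that pattern (assuming an
already-established connection in conn; error handling kept deliberately
simple):

    #include <stdio.h>
    #include <libpq-fe.h>

    /* Send a multi-statement string and walk every result it produces. */
    static void
    run_multi(PGconn *conn, const char *sql)
    {
        PGresult *res;

        if (!PQsendQuery(conn, sql))
        {
            fprintf(stderr, "send failed: %s", PQerrorMessage(conn));
            return;
        }

        /* One PGresult per statement; NULL marks the end of the queue. */
        while ((res = PQgetResult(conn)) != NULL)
        {
            if (PQresultStatus(res) == PGRES_TUPLES_OK)
                printf("%d rows\n", PQntuples(res));
            else
                printf("status: %s\n", PQresStatus(PQresultStatus(res)));
            PQclear(res);
        }
    }

Each statement in the string yields its own PGresult, which is exactly the
per-statement result/status being asked for above.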

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_signal_backend() asymmetry

2012-09-27 Thread Alvaro Herrera
Excerpts from Noah Misch's message of mié sep 26 17:54:43 -0300 2012:
 I'm marking this patch Ready for Committer.  I suggest backpatching it to 9.2;
 the patch corrects an oversight in 9.2 changes.  There's more compatibility
 value in backpatching than in retaining distinct behavior for 9.2 only.
 
 On Thu, Jun 28, 2012 at 09:32:41AM -0700, Josh Kupershmidt wrote:
  ! if (!superuser())
{
/*
  !  * Since the user is not superuser, check for matching roles.
 */
  ! if (proc->roleId != GetUserId())
  ! return SIGNAL_BACKEND_NOPERMISSION;
}
 
 I would have collapsed the conditionals and deleted the obvious comment:
 
 if (!(superuser() || proc->roleId == GetUserId()))
 return SIGNAL_BACKEND_NOPERMISSION;
 
 The committer can do that if desired; no need for another version.

Thanks, I pushed to HEAD and 9.2 with the suggested tweaks.

Documentation doesn't specify the behavior of the non-superuser case
when there's trouble, so I too think the behavior change in 9.2 is okay.
I am less sure about it needing documenting.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql, remove include of psqlscan.c

2012-09-27 Thread Tom Lane
Karl O. Pinc k...@meme.com writes:
 On 09/27/2012 11:02:42 AM, Tom Lane wrote:
 Rather, the problem is that the server might know about some newer
 lexical feature, and so might the application, but if libpq is behind
 the times then it's broken. 

 If the application knows about the newer feature and wants
 to use it, is it unreasonable to ask the application
 be linked against a newer libpq?

Well, see the JDBC example: there is no newer driver with the desired
feature for applications to link against.  Even if all client-side code
with such knowledge stays perfectly in sync as far as upstream sources
are concerned, there are any number of reasons why particular
installations might have copies of varying vintages.  It's not really
a situation that I want to get into more than necessary.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Oid registry

2012-09-27 Thread Robert Haas
On Thu, Sep 27, 2012 at 2:34 PM, Magnus Hagander mag...@hagander.net wrote:
 I'm not sure that's a way we really want to go down. How do we define which
 third party vendors would get to reserve oids? And how many? And under what
 other potential terms?

 Seems like we'd set ourselves up for endless discussions and bike
 shedding...

Not really.  I'm only proposing that it would be nice to have a block
of OIDs that core agrees not to assign for any other purpose, not that
we dole out specific ones to specific companies.  There's no reason
why, for example, EnterpriseDB's fork can't use OIDs from the same
reserved block as PostgreSQL-XC's fork or Greenplum's fork or Aster
Data's fork - those are all distinct projects.  All might need private
OIDs but they can all come from the same range because the code bases
don't mingle.

That having been said, we've gotten this far without having any
terrible trouble about this, so maybe it's not worth worrying about.
It's a nice-to-have, not a big deal.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql, remove include of psqlscan.c

2012-09-27 Thread Karl O. Pinc
On 09/27/2012 02:28:49 PM, Tom Lane wrote:
 Karl O. Pinc k...@meme.com writes:

  What I really _want_ is to be able get a bit string of many
  sql statements from the user and return the results, statuses,
  etc. of executing each statement.  Just what psql does when,
  say, fed a file from stdin.
 
 Just as a note --- I believe you can get that result today with
 PQsendQuery followed by a PQgetResult loop.  There's no need to
 provide an API that splits the string first.

Ah.  *sigh*  I should pay better attention.  Sorry for the
distraction.


Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] autovacuum stress-testing our system

2012-09-27 Thread Simon Riggs
On 27 September 2012 15:57, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 Excerpts from Simon Riggs's message of jue sep 27 06:51:28 -0300 2012:
 On 26 September 2012 15:47, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 
  Really, as far as autovacuum is concerned, it would be much more useful
  to be able to reliably detect that a table has been recently vacuumed,
  without having to request a 10ms-recent pgstat snapshot.  That would
  greatly reduce the amount of time autovac spends on pgstat requests.

 VACUUMing generates a relcache invalidation. Can we arrange for those
 invalidations to be received by autovac launcher, so it gets immediate
 feedback of recent activity without polling?

 Hmm, this is an interesting idea worth exploring, I think.  Maybe we
 should sort tables in the autovac worker to-do list by age of last
 invalidation messages received, or something like that.  Totally unclear
 on the details, but as I said, worth exploring.

Just put them at the back of the queue if an inval is received.

There is already support for listening to relcache inval messages
without ever generating any.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Oid registry

2012-09-27 Thread Magnus Hagander
On Thu, Sep 27, 2012 at 10:02 PM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Sep 27, 2012 at 2:34 PM, Magnus Hagander mag...@hagander.net wrote:
 I'm not sure that's a way we really want to go down. How do we define which
 third party vendors would get to reserve oids? And how many? And under what
 other potential terms?

 Seems like we'd set ourselves up for endless discussions and bike
 shedding...

 Not really.  I'm only proposing that it would be nice to have a block
 of OIDs that core agrees not to assign for any other purpose, not that
 we dole out specific ones to specific companies.  There's no reason

Ah, ok. In that case I agree, that wouldn't be open for a lot of bikeshedding.

 why, for example, EnterpriseDB's fork can't use OIDs from the same
 reserved block as PostgreSQL-XC's fork or Greenplum's fork or Aster
 Data's fork - those are all distinct projects.  All might need private
 OIDs but they can all come from the same range because the code bases
 don't mingle.

 That having been said, we've gotten this far without having any
 terrible trouble about this, so maybe it's not worth worrying about.
 It's a nice-to-have, not a big deal.

Yeah, there's got to be a whole lot of other very much more
complicated things you have to do with each new major version :)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] data to json enhancements

2012-09-27 Thread Andrew Dunstan


On 09/27/2012 10:36 AM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

On 09/27/2012 09:22 AM, Robert Haas wrote:

Maybe I am being too pedantic about this and there is a way to make it
all work nicely, but it sure feels like using the casting machinery
here is blending together two different concepts that are only
sometimes the same.

OK. I think that's a very good point. I guess I was kinda swept away by
this being suggested by a couple of influential people.

Well, that doesn't make it wrong, it just means there's more work
needed.  I'm not that thrilled with magic assumptions about function
names either; schema search path issues, for example, will make that
dangerous.  We've gone to considerable lengths to avoid embedding
assumptions about operator names, and assumptions about function names
aren't any better.

There are at least three ways we could use the cast machinery for this:

(1) Reject Robert's assumption that we have to support both
interpretations for every cast situation.  For instance, it doesn't
seem that unreasonable to me to insist that you have to cast to text
and then to json if you want the literal-reinterpretation behavior.
The main problem then is figuring out a convenient way to provide
interpretation #2 for text itself.



The trouble is, ISTM, that both things seem equally intuitive. You could 
easily argue that x::text::json means take x as text and treat it as 
json, or that it means take x as text and produce a valid json value 
from it by escaping and quoting it. It's particularly ambiguous when x 
is itself already a text value. If we go this way I suspect we'll 
violate POLA for a good number of users.




(2) Add another hidden argument to cast functions, or perhaps repurpose
one of the ones that exist now.  This is likely to come out rather ugly
because of the need to shoehorn it into an API that's already suffered
a couple of rounds of after-the-fact additions, but it's certainly
possible in principle.  The main thing I'd want is to not define it
in a JSON-only fashion --- so the first thing is to be able to explain
the distinction we're trying to make in a type-independent way.


I agree with the ugly part of this analysis :-)



(3) Invent an auxiliary type along the lines of json_value and say
that you create a cast from foo to json_value when you want one
interpretation, or directly to json if you want the other.  Then
things like record_to_json would look for the appropriate type of cast.
This is a bit ugly because the auxiliary type has no reason to live
other than to separate the two kinds of cast, but it avoids creating
any new JSON-specific mechanisms in the type system.




I could accept this. The reason is that very few types are in fact going 
to need a gadget like this. Yes it's mildly ugly, but really fairly 
unobtrusive.


cheers

andrew



There might be some other ideas I'm not thinking of.



Yeah. You've done better than me though :-)

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER command reworks

2012-09-27 Thread Alvaro Herrera
Excerpts from Kohei KaiGai's message of lun sep 10 08:08:32 -0300 2012:

 As attached, I split off the original one into three portions; for set-schema,
 set-owner and rename-to. Please apply them in order of patch filename.
 Regarding to the error message, RenameErrorMsgAlreadyExists() was added
 to handle per object type messages in case when aclcheck_error() is not
 available to utilize.
 All the regression test is contained with the 1st patch to make sure the
 series of reworks does not change existing behaviors.

I checked this and noticed that the regression test has a couple of
failures -- see below.  I think we should remove the test for the first
two hunks, and fix the query for the final one to remove the offending
column.

The good news is that I ran this without applying the whole patch, only
the new regression test files.  I didn't check the files in detail.

I have skimmed over the patches and they seem reasonable too; I will
look at them in more detail later.  I think we should go ahead and apply
the (tweaked) regression test alone, as a first step.

*** /pgsql/source/HEAD/src/test/regress/expected/alter_generic.out  
2012-09-27 17:23:05.0 -0300
--- 
/home/alvherre/Code/pgsql/build/HEAD/src/test/regress/results/alter_generic.out 
2012-09-27 17:29:21.0 -0300
***
*** 106,112 
  CREATE COLLATION alt_coll1 (locale = 'C');
  CREATE COLLATION alt_coll2 (locale = 'C');
  ALTER COLLATION alt_coll1 RENAME TO alt_coll2;  -- failed (name conflict)
! ERROR:  collation alt_coll2 for encoding SQL_ASCII already exists in 
schema alt_nsp1
  ALTER COLLATION alt_coll1 RENAME TO alt_coll3;  -- OK
  ALTER COLLATION alt_coll2 OWNER TO regtest_alter_user2;  -- failed (no role 
membership)
  ERROR:  must be member of role regtest_alter_user2
--- 106,112 
  CREATE COLLATION alt_coll1 (locale = 'C');
  CREATE COLLATION alt_coll2 (locale = 'C');
  ALTER COLLATION alt_coll1 RENAME TO alt_coll2;  -- failed (name conflict)
! ERROR:  collation alt_coll2 for encoding UTF8 already exists in schema 
alt_nsp1
  ALTER COLLATION alt_coll1 RENAME TO alt_coll3;  -- OK
  ALTER COLLATION alt_coll2 OWNER TO regtest_alter_user2;  -- failed (no role 
membership)
  ERROR:  must be member of role regtest_alter_user2
***
*** 125,131 
  ALTER COLLATION alt_coll3 SET SCHEMA alt_nsp2;  -- failed (not owner)
  ERROR:  must be owner of collation alt_coll3
  ALTER COLLATION alt_coll2 SET SCHEMA alt_nsp2;  -- failed (name conflict)
! ERROR:  collation alt_coll2 for encoding SQL_ASCII already exists in 
schema alt_nsp2
  RESET SESSION AUTHORIZATION;
  SELECT n.nspname, c.collname, a.rolname
FROM pg_collation c, pg_namespace n, pg_authid a
--- 125,131 
  ALTER COLLATION alt_coll3 SET SCHEMA alt_nsp2;  -- failed (not owner)
  ERROR:  must be owner of collation alt_coll3
  ALTER COLLATION alt_coll2 SET SCHEMA alt_nsp2;  -- failed (name conflict)
! ERROR:  collation alt_coll2 for encoding UTF8 already exists in schema 
alt_nsp2
  RESET SESSION AUTHORIZATION;
  SELECT n.nspname, c.collname, a.rolname
FROM pg_collation c, pg_namespace n, pg_authid a
***
*** 334,341 
ORDER BY nspname, opfname;
   nspname  | opfname  | amname |   rolname   
  --+--++-
!  alt_nsp1 | alt_opc1 | hash   | kaigai
!  alt_nsp1 | alt_opc2 | hash   | kaigai
   alt_nsp1 | alt_opf2 | hash   | regtest_alter_user2
   alt_nsp1 | alt_opf3 | hash   | regtest_alter_user1
   alt_nsp1 | alt_opf4 | hash   | regtest_alter_user2
--- 334,341 
ORDER BY nspname, opfname;
   nspname  | opfname  | amname |   rolname   
  --+--++-
!  alt_nsp1 | alt_opc1 | hash   | alvherre
!  alt_nsp1 | alt_opc2 | hash   | alvherre
   alt_nsp1 | alt_opf2 | hash   | regtest_alter_user2
   alt_nsp1 | alt_opf3 | hash   | regtest_alter_user1
   alt_nsp1 | alt_opf4 | hash   | regtest_alter_user2

==

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]

2012-09-27 Thread Pavel Stehule
Hello

I reduced this patch to just a solution for the PL/pgSQL (SPI) side of the
problem.

A correct generic solution needs a discussion about enhancing our libpq
interface - we can get the number of returned rows, but we should be able
to get the number of processed rows too. Users currently have to parse this
number from the completionTag, but this problem is not too hot and usually
is not a blocker, because anybody can parse the completionTag themselves.

So this issue is primarily about PL/pgSQL, where this information is not
available at all. There is a precedent - CREATE TABLE AS SELECT (thanks for
the sample :)) - and COPY should have the same effect on ROW_COUNT as that
statement (the latest patch implements it).

Regards

Pavel



2012/9/21 Amit Kapila amit.kap...@huawei.com:
 On Friday, September 21, 2012 1:23 AM Pavel Stehule wrote:

 Basic stuff:
 
 - Patch applies OK. but offset difference in line numbers.
 - Compiles with errors in contrib [pg_stat_statements, sepgsql] modules
 - Regression failed; one test-case in COPY due to incomplete test-case
 attached patch. – same as reported by Heikki

fixed patch is in attachment

 After modifications:
 ---
 - Patch applies OK
 - Compiles cleanly without any errors/warnings
 - Regression tests pass.


 What it does:
 --
 Modification to get the number of processed rows evaluated via SPI. The
 changes are to add extra parameter in ProcessUtility to get the number of
 rows processed by COPY command.

 Code Review Comments:
 -
 1. New parameter is added to ProcessUtility_hook_type function
but the functions which get assigned to these functions like
 sepgsql_utility_command, pgss_ProcessUtility, prototype & definition is
 not modified.

 Functionality is not fixed correctly for hook functions, In function 
 pgss_ProcessUtility
 for bellow snippet of code processed parameter is passed NULL, as well as not 
 initialized.
 because of this when pg_stat_statements extention is utilized COPY command 
 is giving garbage values.
 if (prev_ProcessUtility)
 prev_ProcessUtility(parsetree, queryString, params,
 dest, completionTag, 
 context, NULL);
 else
 standard_ProcessUtility(parsetree, queryString, params,
 dest, completionTag, 
 context, NULL);

 Testcase is attached.
 In this testcase table has only 1000 records but it show garbage value.
 postgres=# show shared_preload_libraries ;
  shared_preload_libraries
 --
  pg_stat_statements
 (1 row)
 postgres=# CREATE TABLE tbl (a int);
 CREATE TABLE
 postgres=# INSERT INTO tbl VALUES(generate_series(1,1000));
 INSERT 0 1000
 postgres=# do $$
 declare r int;
 begin
   copy tbl to '/home/kiran/copytest.csv' csv;
   get diagnostics r = row_count;
   raise notice 'exported % rows', r;
   truncate tbl;
   copy tbl from '/home/kiran/copytest.csv' csv;
   get diagnostics r = row_count;
   raise notice 'imported % rows', r;
 end;
 $$ language plpgsql;
 postgres$#
 NOTICE:  exported 13281616 rows
 NOTICE:  imported 13281616 rows
 DO


 2. Why to add the new parameter if completionTag hold the number of
 processed tuple information; can be extracted

from it as follows:
  _SPI_current->processed = strtoul(completionTag + 7,
 NULL, 10);

this is a basic question. I prefer a natural type for the counter - uint64
instead of text. And there is no simple way to get the offset (7 in this
case).
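
To illustrate that point, a small sketch (not from the patch) of pulling the
count out of a tag without hard-coding the offset - here by scanning back to
the last space:

    #include <stdlib.h>
    #include <string.h>

    /*
     * Illustrative only: extract the row count from a completion tag such
     * as "COPY 1000" or "INSERT 0 1000" without hard-coding the offset.
     */
    static unsigned long
    tag_row_count(const char *completionTag)
    {
        const char *p = strrchr(completionTag, ' ');

        return p ? strtoul(p + 1, NULL, 10) : 0;
    }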

 I agree with your point, but currently in few other places we are parsing 
 the completion tag for getting number of tuples processed. So may be in 
 future we can change those places as well. For example

yes, this step can be done in the future - together with the libpq
enhancement. It is not adequate to change the API (and break lots of
extensions) for this isolated problem. So I propose fixing the plpgsql
issue, and adding a ToDo item - enhance libpq to return the number of
processed rows as a number - though that change breaks compatibility.

Pavel


 pgss_ProcessUtility
 {
 ..

 /* parse command tag to retrieve the number of affected rows. */
 if (completionTag &&
 sscanf(completionTag, "COPY " UINT64_FORMAT, &rows) != 1)
 rows = 0;

 }

 _SPI_execute_plan
 {
 ..
 ..
 if (IsA(stmt, CreateTableAsStmt))
 {
 Assert(strncmp(completionTag, "SELECT ", 7) == 0);
 _SPI_current->processed = strtoul(completionTag + 7,
   
 NULL, 10);

 ..
 }


 With Regards,
 Amit Kapila.



 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


copy-processed-rows-simple.patch
Description: 

Re: [HACKERS] Patch: incorrect array offset in backend replication tar header

2012-09-27 Thread Magnus Hagander
On Tue, Sep 25, 2012 at 5:08 PM, Brian Weaver cmdrcluel...@gmail.com wrote:
 Unless I misread the code, the tar format and streaming xlog are
 mutually exclusive. Considering my normal state of fatigue it's not
 unlikely. I don't want to have to set my wal_keep_segments
 artificially high just for the backup

Correct, you can't use both of those at the same time. That can
certainly be improved - but injecting a file into the tar from a
different process is far from easy. But one idea might be to just
stream the WAL into a *separate* tarfile in this case.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: incorrect array offset in backend replication tar header

2012-09-27 Thread Magnus Hagander
On Tue, Sep 25, 2012 at 4:07 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Brian Weaver cmdrcluel...@gmail.com writes:
 While researching the way streaming replication works I was examining
 the construction of the tar file header. By comparing documentation on
 the tar header format from various sources I certain the following
 patch should be applied to so the group identifier is put into thee
 header properly.

 Yeah, this is definitely wrong.

 While I realize that wikipedia isn't always the best source of
 information, the header offsets seem to match the other documentation
 I've found. The format is just easier to read on wikipedia

 The authoritative specification can be found in the pax page in the
 POSIX spec, which is available here:
 http://pubs.opengroup.org/onlinepubs/9699919799/

 I agree that the 117 number is bogus, and also that the magic ustar
 string is written incorrectly.  What's more, it appears that the latter
 error has been copied from pg_dump (but the 117 seems to be just a new
 bug in pg_basebackup).  I wonder what else might be wrong hereabouts :-(
 Will sit down and take a closer look.

 I believe what we need to do about this is:

 1. fix pg_dump and pg_basebackup output to conform to spec.

 2. make sure pg_restore will accept both conformant and
previous-generation files.

 Am I right in believing that we don't have any code that's expected to
 read pg_basebackup output?  We just feed it to tar, no?

Yes. We generate a .tarfile, but we have no tool that reads it
ourselves. (unlike pg_dump where we have pg_restore that actually
reads it)


 I'm a bit concerned about backwards compatibility issues.  It looks to
 me like existing versions of pg_restore will flat out reject files that
 have a spec-compliant ustar\0 MAGIC field.  Is it going to be
 sufficient if we fix this in minor-version updates, or are we going to
 need to have a switch that tells pg_dump to emit the incorrect old
 format?  (Ick.)

Do we officially support using an older pg_restore to reload a newer
dump? I think not? As long as we don't officially support that, I
think we'll be ok.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: incorrect array offset in backend replication tar header

2012-09-27 Thread Brian Weaver
Magnus,

I probably just did a poor job of explaining what I wanted to try. I
was going to have the base backup open two connections; one to stream
the tar archive, the second to receive the wal files like
pg_receivexlog.

The WAL files received on the second connection would be streamed to a
temporary file, with tar headers. Then, once the complete tar archive
from the first connection had been received, simply replay the contents
of the temporary file to append them to the tar archive.

Turns out that isn't necessary. It was an idea born of my
misunderstanding of how pg_basebackup worked. The archive will
include all the WAL files if I make wal_keep_segments high enough.

-- Brian

On Thu, Sep 27, 2012 at 6:01 PM, Magnus Hagander mag...@hagander.net wrote:
 On Tue, Sep 25, 2012 at 5:08 PM, Brian Weaver cmdrcluel...@gmail.com wrote:
 Unless I misread the code, the tar format and streaming xlog are
 mutually exclusive. Considering my normal state of fatigue it's not
 unlikely. I don't want to have to set my wal_keep_segments
 artificially high just for the backup

 Correct, you can't use both of those at the same time. That can
 certainly be improved - but injecting a file into the tar from a
 different process is far from easy. But one idea might be to just
 stream the WAL into a *separate* tarfile in this case.


 --
  Magnus Hagander
  Me: http://www.hagander.net/
  Work: http://www.redpill-linpro.com/



-- 

/* insert witty comment here */


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: incorrect array offset in backend replication tar header

2012-09-27 Thread Brian Weaver
OK, here is my attempt at patching and correcting the issue in this
thread. I have done my best to test to ensure that hot standby,
pg_basebackup, and pg_restore of older files work without issues. I
think this might be a larger patch than expected; I took some
liberties and tried to clean things up a bit.

For example, the block size '512' was scattered throughout the tar-related
code. I've replaced instances of that with a defined constant
TAR_BLOCK_SIZE. I've likewise created other constants and used them in
place of raw numbers, which I hope makes the code a bit more readable.

I've also used functions like strncpy(), strnlen(), and the like in
place of sprintf() where I could. Also, instead of using sscanf() I
used a custom octal conversion routine which has a hard limit on how
many characters it will process, like strncpy() & strnlen().
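
For what it's worth, a bounded octal parser along those lines might look
roughly like this (a sketch of the idea only, not the code from the attached
patch):

    #include <stddef.h>
    #include <stdint.h>

    /*
     * Sketch: parse at most 'len' octal digits from a tar header field,
     * stopping at the first non-octal character (NUL or space padding).
     */
    static uint64_t
    read_tar_octal(const char *field, size_t len)
    {
        uint64_t    val = 0;
        size_t      i;

        for (i = 0; i < len; i++)
        {
            char    c = field[i];

            if (c < '0' || c > '7')
                break;
            val = (val << 3) + (c - '0');
        }
        return val;
    }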

I expect comments, hopefully they'll be positive.

-- Brian
-- 

/* insert witty comment here */


postgresql-tar.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: incorrect array offset in backend replication tar header

2012-09-27 Thread Magnus Hagander
On Fri, Sep 28, 2012 at 12:12 AM, Brian Weaver cmdrcluel...@gmail.com wrote:
 Magnus,

 I probably just did a poor job of explaining what I wanted to try. I
 was going to have the base backup open two connections; one to stream
 the tar archive, the second to receive the wal files like
 pg_receivexlog.

This is what --xlog=stream does.


 The wal files received on the second connection would be streamed to a
 temporary file, with tar headers. Then when the complete tar archive
 from the first header was complete received simply replay the contents
 from the temporary file to append them to the tar archive.

Ah, yeah, that should also work I guess. But you could also just leave
the data in the temporary tarfile the whole time. IIRC, you can
just cat one tarfile to the end of another one, right?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: incorrect array offset in backend replication tar header

2012-09-27 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Tue, Sep 25, 2012 at 4:07 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm a bit concerned about backwards compatibility issues.  It looks to
 me like existing versions of pg_restore will flat out reject files that
 have a spec-compliant ustar\0 MAGIC field.  Is it going to be
 sufficient if we fix this in minor-version updates, or are we going to
 need to have a switch that tells pg_dump to emit the incorrect old
 format?  (Ick.)

 Do we officially support using an older pg_restore to reload a newer
 dump? I think not? As long as we don't officially support that, I
 think we'll be ok.

Well, for the -Fc format, we have an explicit version number, and
pg_restore is supposed to be able to read anything with current or prior
version number.  We don't bump the version number too often, but we've
definitely done it anytime we added new features at the file-format
level.  However, since the whole point of the -Ft format is to be
standard-compliant, people might be surprised if it fell over in a
backwards-compatibility situation.

Having said all that, I don't think we have a lot of choices here.
A tar format output option that isn't actually tar format has hardly
any excuse to live at all.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: incorrect array offset in backend replication tar header

2012-09-27 Thread Magnus Hagander
On Fri, Sep 28, 2012 at 12:30 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Tue, Sep 25, 2012 at 4:07 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm a bit concerned about backwards compatibility issues.  It looks to
 me like existing versions of pg_restore will flat out reject files that
 have a spec-compliant ustar\0 MAGIC field.  Is it going to be
 sufficient if we fix this in minor-version updates, or are we going to
 need to have a switch that tells pg_dump to emit the incorrect old
 format?  (Ick.)

 Do we officially support using an older pg_restore to reload a newer
 dump? I think not? As long as we don't officially support that, I
 think we'll be ok.

 Well, for the -Fc format, we have an explicit version number, and
 pg_restore is supposed to be able to read anything with current or prior
 version number.  We don't bump the version number too often, but we've
 definitely done it anytime we added new features at the file-format
 level.  However, since the whole point of the -Ft format is to be
 standard-compliant, people might be surprised if it fell over in a
 backwards-compatibility situation.

 Having said all that, I don't think we have a lot of choices here.
 A tar format output option that isn't actually tar format has hardly
 any excuse to live at all.

Yeah, that's what I'm thinking - it's really a bugfix...

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: incorrect array offset in backend replication tar header

2012-09-27 Thread Tom Lane
Brian Weaver cmdrcluel...@gmail.com writes:
 OK, here is my attempt at patching and correcting the issue in this
 thread. I have done my best to test to ensure that hot standby,
 pg_basebackup, and pg_restore of older files work without issues. I
 think this might be a larger patch that expected, I took some
 liberties of trying to clean up a bit.

 For example the block size '512' was scattered throughout the code
 regarding the tar block size. I've replace instances of that with a
 defined constant TAR_BLOCK_SIZE. I've likewise created other constants
 and used them in place of raw numbers in what I hope makes the code a
 bit more readable.

That seems possibly reasonable ...

 I've also used functions like strncpy(), strnlen(), and the like in
 place of sprintf() where I could. Also instead of using sscanf() I
 used a custom octal conversion routine which has a hard limit on how
 many characters it will process, like strncpy() & strnlen().

... but I doubt that this really constitutes a readability improvement.
Or a portability improvement.  strnlen for instance is not to be found
in Single Unix Spec v2 (http://pubs.opengroup.org/onlinepubs/007908799/)
which is what we usually take as our baseline assumption about which
system functions are available everywhere.  By and large, I think the
more different system functions you rely on, the harder it is to read
your code, even if some unusual system function happens to exactly match
your needs in particular places.  It also greatly increases the risk
of having portability problems, eg on Windows, or non-mainstream Unix
platforms.

But a larger point is that the immediate need is to fix bugs.  Code
beautification is a separate activity and would be better submitted as
a separate patch.  There is no way I'd consider applying most of this
patch to the back branches, for instance.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: incorrect array offset in backend replication tar header

2012-09-27 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 Ah, yeah, that should also work I guess. But you could also just leave
 the the data in the temporary tarfile the whole time. IIRC, you can
 just cat one tarfile to the end of another one, right?

Not if they're written according to spec, I think.  There is an EOF
marker consisting of 2 blocks of zeroes (which pg_dump is failing to
create correctly, but that's a bug not a feature).  Possibly you could
strip the EOF marker though, if the file join code is allowed to be
something smarter than cat.
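
(For reference, the end-of-archive marker itself is trivial to emit; a
sketch, assuming an already-open output stream:)

    #include <stdio.h>

    #define TAR_BLOCK_SIZE 512

    /* Illustrative only: the POSIX end-of-archive marker is two zero blocks. */
    static void
    write_tar_eof(FILE *archive)
    {
        static const char zeroblock[TAR_BLOCK_SIZE];    /* zero-initialized */

        fwrite(zeroblock, 1, TAR_BLOCK_SIZE, archive);
        fwrite(zeroblock, 1, TAR_BLOCK_SIZE, archive);
    }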

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: incorrect array offset in backend replication tar header

2012-09-27 Thread Andrew Dunstan


On 09/27/2012 06:30 PM, Tom Lane wrote:

Magnus Hagander mag...@hagander.net writes:

On Tue, Sep 25, 2012 at 4:07 AM, Tom Lane t...@sss.pgh.pa.us wrote:

I'm a bit concerned about backwards compatibility issues.  It looks to
me like existing versions of pg_restore will flat out reject files that
have a spec-compliant ustar\0 MAGIC field.  Is it going to be
sufficient if we fix this in minor-version updates, or are we going to
need to have a switch that tells pg_dump to emit the incorrect old
format?  (Ick.)

Do we officially support using an older pg_restore to reload a newer
dump? I think not? As long as we don't officially support that, I
think we'll be ok.

Well, for the -Fc format, we have an explicit version number, and
pg_restore is supposed to be able to read anything with current or prior
version number.  We don't bump the version number too often, but we've
definitely done it anytime we added new features at the file-format
level.  However, since the whole point of the -Ft format is to be
standard-compliant, people might be surprised if it fell over in a
backwards-compatibility situation.

Having said all that, I don't think we have a lot of choices here.
A tar format output option that isn't actually tar format has hardly
any excuse to live at all.



I agree, but it's possibly worth pointing out that GNU tar has no 
trouble at all processing the erroneous format, and the file program 
on my Linux system has no trouble recognizing it as a tar archive.


Nevertheless, I think we should fix all live versions of pg_dump and make 
all live versions of pg_restore accept both formats.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: incorrect array offset in backend replication tar header

2012-09-27 Thread Magnus Hagander
On Fri, Sep 28, 2012 at 12:45 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 Ah, yeah, that should also work I guess. But you could also just leave
 the the data in the temporary tarfile the whole time. IIRC, you can
 just cat one tarfile to the end of another one, right?

 Not if they're written according to spec, I think.  There is an EOF
 marker consisting of 2 blocks of zeroes (which pg_dump is failing to
 create correctly, but that's a bug not a feature).  Possibly you could
 strip the EOF marker though, if the file join code is allowed to be
 something smarter than cat.

Hmm. Yeah. It seems GNU tar has --concatenate. Not sure if it's
in the standard or a GNU extension though. But it says:


However, tar archives incorporate an end-of-file marker which must be
removed if the concatenated archives are to be read properly as one
archive. `--concatenate' removes the end-of-archive marker from the
target archive before each new archive is appended. If you use cat to
combine the archives, the result will not be a valid tar format
archive. If you need to retrieve files from an archive that was added
to using the cat utility, use the --ignore-zeros (-i) option.




-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: incorrect array offset in backend replication tar header

2012-09-27 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 09/27/2012 06:30 PM, Tom Lane wrote:
 Having said all that, I don't think we have a lot of choices here.
 A tar format output option that isn't actually tar format has hardly
 any excuse to live at all.

 I agree, but it's possibly worth pointing out that GNU tar has no 
 trouble at all processing the erroneous format, and the file program 
 on my Linux system has no trouble recognizing it as a tar archive.

Well, they're falling back to assuming that the file is a pre-POSIX
tarfile, which is why you don't see string user/group names for
instance.

 Nevertheless, I think we should fix all live versions of pg_dump make 
 all live versions of pg-restore accept both formats.

I think it's clear that we should make all versions of pg_restore accept
either spelling of the magic string.  It's less clear that we should
change the output of pg_dump in back branches though.  I think the only
reason we'd not get complaints about that is that not that many people
are relying on tar-format output anyway.  Anybody who is would probably
be peeved if version 8.3.21 pg_restore couldn't read the output of
version 8.3.22 pg_dump.
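
On the read side, "accept either spelling" could be as simple as the
following sketch (a hypothetical helper, not the actual pg_restore code; it
accepts the spec-compliant POSIX magic and anything else that merely starts
with "ustar" at offset 257):

    #include <stdbool.h>
    #include <string.h>

    #define TAR_MAGIC_OFFSET 257

    /*
     * Hypothetical check: accept a spec-compliant POSIX header ("ustar\0"
     * plus version "00") as well as any legacy header that merely begins
     * with "ustar" at the same offset.
     */
    static bool
    tar_magic_ok(const char *header)
    {
        if (memcmp(header + TAR_MAGIC_OFFSET, "ustar\0" "00", 8) == 0)
            return true;
        return strncmp(header + TAR_MAGIC_OFFSET, "ustar", 5) == 0;
    }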

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: incorrect array offset in backend replication tar header

2012-09-27 Thread Magnus Hagander
On Fri, Sep 28, 2012 at 12:55 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andrew Dunstan and...@dunslane.net writes:
 On 09/27/2012 06:30 PM, Tom Lane wrote:
 Having said all that, I don't think we have a lot of choices here.
 A tar format output option that isn't actually tar format has hardly
 any excuse to live at all.

 I agree, but it's possibly worth pointing out that GNU tar has no
 trouble at all processing the erroneous format, and the file program
 on my Linux system has no trouble recognizing it as a tar archive.

 Well, they're falling back to assuming that the file is a pre-POSIX
 tarfile, which is why you don't see string user/group names for
 instance.

 Nevertheless, I think we should fix all live versions of pg_dump make
 all live versions of pg-restore accept both formats.

 I think it's clear that we should make all versions of pg_restore accept
 either spelling of the magic string.  It's less clear that we should
 change the output of pg_dump in back branches though.  I think the only
 reason we'd not get complaints about that is that not that many people
 are relying on tar-format output anyway.  Anybody who is would probably
 be peeved if version 8.3.21 pg_restore couldn't read the output of
 version 8.3.22 pg_dump.

There's no real point to using the tar format in pg_dump, really, is
there? Which is why I think most people just don't use it.

pg_basebackup in tar format is a much more useful thing, of course...

So we could fix just pg_basebackup in backbranches (since we never
read anything, it shouldn't be that big a problem), and then do
pg_dump in HEAD only?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] data to json enhancements

2012-09-27 Thread Hannu Krosing

On 09/27/2012 09:18 PM, Andrew Dunstan wrote:


On 09/27/2012 10:36 AM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

On 09/27/2012 09:22 AM, Robert Haas wrote:

Maybe I am being too pedantic about this and there is a way to make it
all work nicely, but it sure feels like using the casting machinery
here is blending together two different concepts that are only
sometimes the same.

OK. I think that's a very good point. I guess I was kinda swept away by
this being suggested by a couple of influential people.

Well, that doesn't make it wrong, it just means there's more work
needed.  I'm not that thrilled with magic assumptions about function
names either; schema search path issues, for example, will make that
dangerous.  We've gone to considerable lengths to avoid embedding
assumptions about operator names, and assumptions about function names
aren't any better.

There are at least three ways we could use the cast machinery for this:

(1) Reject Robert's assumption that we have to support both
interpretations for every cast situation.  For instance, it doesn't
seem that unreasonable to me to insist that you have to cast to text
and then to json if you want the literal-reinterpretation behavior.

Maybe cast not to text but to cstring for getting the text-is-already-json
behavior?

That is, reuse the current type io functions as the literal casts.

This way a cast of '{"a": 1}'::json::text will fail, as this json value
really does not represent a text/string value.


The main problem then is figuring out a convenient way to provide
interpretation #2 for text itself.



The trouble is, ISTM, that both things seem equally intuitive. You 
could easily argue that x::text::json means take x as text and treat 
it as json, or that it means take x as text and produce a valid json 
value from it by escaping and quoting it. It's particularly ambiguous 
when x is itself already a text value. If we go this way I suspect 
we'll violate POLA for a good number of users.
It may be easier to sort this out if we think in terms of symmetry and 
unambiguity.


let's postulate that mytype::json::mytype and json::mytype::json should 
always reproduce the original result or they should fail.


so '[1,2,3]'::text::json::text === '[1,2,3]'::text with the intermediate
json being '"[1,2,3]"'

and '[1,2,3]'::json::text::json fails at the json->text cast, as
'[1,2,3]'::json does not represent a text value (in a similar way as
'[1,2,3]'::json::date fails)

on the other hand '[1,2,3]'::json::int[]::json should succeed as there 
is a direct mapping to int array.





(3) Invent an auxiliary type along the lines of json_value and say
that you create a cast from foo to json_value when you want one
interpretation, or directly to json if you want the other.  Then
things like record_to_json would look for the appropriate type of cast.
This is a bit ugly because the auxiliary type has no reason to live
other than to separate the two kinds of cast, but it avoids creating
any new JSON-specific mechanisms in the type system.
As suggested above, this special type could be on the other side - the type cstring, as already used for the type I/O functions.

The main problem here is that currently we do interpret ::text::json as if it were the type input function.

We do proper selective quoting when converting back to json:

hannu=# create table jt(t text, j json);
CREATE TABLE
hannu=# insert into jt values ('[1,2]','[3,4]');
INSERT 0 1
hannu=# select row_to_json(jt) from jt;
      row_to_json
------------------------
 {"t":[1,2],"j":[3,4]}
(1 row)

but we do automatic casting through cstring and the json type input function when converting text to json:


hannu=# select t::json, j::json from jt;
   t   |   j
-------+-------
 [1,2] | [3,4]
(1 row)

This should probably be cleaned up.



I could accept this. The reason is that very few types are in fact 
going to need a gadget like this. Yes it's mildly ugly, but really 
fairly unobtrusive.


cheers

andrew



There might be some other ideas I'm not thinking of.



Yeah. You've done better than me though :-)

cheers

andrew







Re: [HACKERS] Patch: incorrect array offset in backend replication tar header

2012-09-27 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Fri, Sep 28, 2012 at 12:55 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I think it's clear that we should make all versions of pg_restore accept
 either spelling of the magic string.  It's less clear that we should
 change the output of pg_dump in back branches though.  I think the only
 reason we'd not get complaints about that is that not that many people
 are relying on tar-format output anyway.  Anybody who is would probably
 be peeved if version 8.3.21 pg_restore couldn't read the output of
 version 8.3.22 pg_dump.

 There's no real point to using the tar format in pg_dump, really, is
 there? Which is why I think most people just don't use it.

I think folks who know the tool realize that custom format is better.
But we've seen plenty of indications that novices sometimes choose tar
format.  For instance, people occasionally complain about its
8GB-per-file limit.

 pg_basebackup in tar format is a much more useful thing, of course...

Right.

 So we could fix just pg_basebackup in backbranches (since we never
 read anything, it shouldn't be that big a problem), and then do
 pg_dump in HEAD only?

Yeah, pg_basebackup is an entirely different tradeoff.  I think we
want to fix its problems in all branches.

regards, tom lane
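
For illustration, accepting either spelling of the magic string could be
as simple as the sketch below. The offsets follow the POSIX ustar layout
(magic field at byte 257 of the 512-byte header); the exact legacy byte
sequence shown for the old spelling is an assumption here, not a quote
from the patch.

    #include <stdbool.h>
    #include <string.h>

    /* Sketch only: tolerate both the POSIX and a legacy magic spelling. */
    static bool
    tar_magic_is_acceptable(const char *h)
    {
        /* POSIX ustar: magic "ustar\0" followed by version "00" at offset 257. */
        static const char posix_magic[8] = {'u','s','t','a','r','\0','0','0'};

        if (memcmp(h + 257, posix_magic, 8) == 0)
            return true;

        /* Assumed legacy spelling: "ustar  " spanning magic and version. */
        if (memcmp(h + 257, "ustar  ", 7) == 0)
            return true;

        return false;
    }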




Re: [HACKERS] MD's replication WAS Switching timeline over streaming replication

2012-09-27 Thread Josh Berkus
On 9/26/12 6:17 PM, m...@rpzdesign.com wrote:
 Josh:
 
 The good part is you are the first person to ask for a copy,
 and I will send you the hook code that I have. You can be a good sport
 and put it on GitHub; that is great. You can give us both credit for a
 joint effort: I do the code, you put it on GitHub.

Well, I think it just makes sense for you to put it up somewhere public
so that folks can review it; if not Github, then somewhere else.  If
it's useful and well-written, folks will be interested.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] data to json enhancements

2012-09-27 Thread Andrew Dunstan


On 09/27/2012 06:58 PM, Hannu Krosing wrote:

On 09/27/2012 09:18 PM, Andrew Dunstan wrote:


On 09/27/2012 10:36 AM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

On 09/27/2012 09:22 AM, Robert Haas wrote:
Maybe I am being too pedantic about this and there is a way to 
make it

all work nicely, but it sure feels like using the casting machinery
here is blending together two different concepts that are only
sometimes the same.
OK. I think that's a very good point. I guess I was kinda swept 
away by

this being suggested by a couple of influential people.

Well, that doesn't make it wrong, it just means there's more work
needed.  I'm not that thrilled with magic assumptions about function
names either; schema search path issues, for example, will make that
dangerous.  We've gone to considerable lengths to avoid embedding
assumptions about operator names, and assumptions about function names
aren't any better.

There are at least three ways we could use the cast machinery for this:

(1) Reject Robert's assumption that we have to support both
interpretations for every cast situation.  For instance, it doesn't
seem that unreasonable to me to insist that you have to cast to text
and then to json if you want the literal-reinterpretation behavior.
Maybe cast not to text but to cstring for getting the 
text-is-already-json ?


That is, reuse the current type io as literal casts.

This way a cast of '{"a": 1}'::json::text will fail, as this json value really does not represent a text/string value.


The main problem then is figuring out a convenient way to provide
interpretation #2 for text itself.



The trouble is, ISTM, that both things seem equally intuitive. You 
could easily argue that x::text::json means take x as text and treat 
it as json, or that it means take x as text and produce a valid json 
value from it by escaping and quoting it. It's particularly ambiguous 
when x is itself already a text value. If we go this way I suspect 
we'll violate POLA for a good number of users.
It may be easier to sort this out if we think in terms of symmetry and 
unambiguity.


let's postulate that mytype::json::mytype and json::mytype::json 
should always reproduce the original result or they should fail.



Where are all these casts from json going to come from? What is going to 
dequote and unescape strings, or turn objects into hstores? You're 
making this much bigger than what I had in mind. The advantage of Tom's 
option (3) that I liked is that it is very minimal. Any type can provide 
its own function for conversion to json. If it's there we use it, if 
it's not we use its standard text representation. Let's stick to the 
KISS principle.


cheers

andrew
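
As a rough sketch of that minimal fallback rule (use a type-provided
conversion to json when one exists, otherwise emit the type's standard
text representation, escaped and quoted), the dispatch could look like
this; the helper names are hypothetical, not backend APIs:

    #include <stdbool.h>

    typedef unsigned int Oid;    /* stand-in for the backend's Oid */

    /* Hypothetical helpers for illustration only: */
    extern bool  type_has_json_conversion(Oid typid);
    extern char *convert_via_json_conversion(Oid typid, const char *textval);
    extern char *escape_and_quote_as_json_string(const char *textval);

    /*
     * Sketch only: if the type supplies its own conversion to json, use it;
     * otherwise fall back to the standard text output, quoted and escaped
     * as a json string.
     */
    static char *
    value_to_json_sketch(Oid typid, const char *textval)
    {
        if (type_has_json_conversion(typid))
            return convert_via_json_conversion(typid, textval);

        return escape_and_quote_as_json_string(textval);
    }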





Re: [HACKERS] Switching timeline over streaming replication

2012-09-27 Thread Euler Taveira
On 27-09-2012 01:30, Amit Kapila wrote:
 I understood this point, but currently in documentation of Timelines, this 
 usecase is not documented (Section 24.3.5).
 
Timeline documentation was written during the PITR implementation. There wasn't SR
yet. AFAICS it doesn't cite SR but is sufficiently generic (it uses the 'wal
records' term to explain the feature). Feel free to reword those paragraphs,
mentioning SR.


-- 
   Euler Taveira de Oliveira - Timbira   http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento




Re: [HACKERS] Patch: incorrect array offset in backend replication tar header

2012-09-27 Thread Brian Weaver
On Thu, Sep 27, 2012 at 6:43 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Brian Weaver cmdrcluel...@gmail.com writes:
 OK, here is my attempt at patching and correcting the issue in this
 thread. I have done my best to test to ensure that hot standby,
 pg_basebackup, and pg_restore of older files work without issues. I
 think this might be a larger patch than expected; I took some
 liberties in trying to clean up a bit.

 For example the block size '512' was scattered throughout the code
 regarding the tar block size. I've replaced instances of that with a
 defined constant TAR_BLOCK_SIZE. I've likewise created other constants
 and used them in place of raw numbers in what I hope makes the code a
 bit more readable.

 That seems possibly reasonable ...

 I've also used functions like strncpy(), strnlen(), and the like in
 place of sprintf() where I could. Also instead of using sscanf() I
 used a custom octal conversion routine which has a hard limit on how
 many characters it will process, like strncpy() & strnlen().

 ... but I doubt that this really constitutes a readability improvement.
 Or a portability improvement.  strnlen for instance is not to be found
 in Single Unix Spec v2 (http://pubs.opengroup.org/onlinepubs/007908799/)
 which is what we usually take as our baseline assumption about which
 system functions are available everywhere.  By and large, I think the
 more different system functions you rely on, the harder it is to read
 your code, even if some unusual system function happens to exactly match
 your needs in particular places.  It also greatly increases the risk
 of having portability problems, eg on Windows, or non-mainstream Unix
 platforms.

 But a larger point is that the immediate need is to fix bugs.  Code
 beautification is a separate activity and would be better submitted as
 a separate patch.  There is no way I'd consider applying most of this
 patch to the back branches, for instance.

 regards, tom lane


Here's a very minimal fix then; perhaps it will be more palatable.
Even though I regret the effort I put into the first patch, it's in my
employer's best interest that it's fixed, so I'm obliged to try to
remediate the problem in something more acceptable to the community.

enjoy
-- 

/* insert witty comment here */


postgresql-tar.patch
Description: Binary data
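
For what it's worth, a length-limited octal parser of the kind described
above (a bounded replacement for sscanf() on fixed-size tar header
fields) might be sketched like this; it is an illustration, not the code
from either patch:

    #include <stddef.h>
    #include <stdint.h>

    /*
     * Sketch only: parse an octal numeric field of at most 'len' bytes, as
     * found in tar headers.  Unlike sscanf(), it can never scan past the
     * end of the fixed-size field.
     */
    static uint64_t
    read_tar_octal_field(const char *field, size_t len)
    {
        uint64_t    result = 0;
        size_t      i = 0;

        /* Some tar writers pad numeric fields with leading spaces. */
        while (i < len && field[i] == ' ')
            i++;

        for (; i < len; i++)
        {
            char    c = field[i];

            if (c < '0' || c > '7')
                break;          /* stop at NUL, space, or any non-octal byte */
            result = (result << 3) | (uint64_t) (c - '0');
        }
        return result;
    }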



[HACKERS] setting per-database/role parameters checks them against wrong context

2012-09-27 Thread Peter Eisentraut
Example:

create temporary table foo (a int);
insert into foo values (1);
alter role peter set temp_buffers = 120;
ERROR:  22023: invalid value for parameter temp_buffers: 120
DETAIL:  temp_buffers cannot be changed after any temporary tables
have been accessed in the session.

Another example:

set log_statement_stats = on;
alter role peter set log_parser_stats = on;
ERROR:  22023: invalid value for parameter log_parser_stats: 1
DETAIL:  Cannot enable parameter when log_statement_stats is true.

Another example is that in <= 9.1, ALTER DATABASE ... SET search_path
would check the existence of the schema in the current database, but
that was done away with in 9.2.

The first example could probably be fixed if check_temp_buffers() paid
attention to the GucSource, but in the second case, and in general, there
doesn't seem to be a way to distinguish assigning a per-database setting
from enacting a per-database setting as a source.

Ideas?
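
To make the first idea concrete: if a check hook could tell "storing a
per-database/role default" apart from "enacting it in a session", it
could skip the session-state test for the former.  A rough sketch is
below, modeled on the shape of the GUC check hooks, with the caveat that
the helper and the source distinction it relies on are exactly what is
missing today:

    #include <stdbool.h>

    typedef int GucSource;      /* stand-in for the enum in guc.h */

    /* Hypothetical: true when the value is only being stored in the catalog. */
    extern bool source_is_only_stored(GucSource source);

    extern int  NLocBuffer;     /* session-local temp buffer state */

    static bool
    check_temp_buffers_sketch(int *newval, void **extra, GucSource source)
    {
        /* ALTER ROLE/DATABASE ... SET: don't check against session state. */
        if (source_is_only_stored(source))
            return true;

        /* Enacting in a session: too late once temp tables have been used. */
        if (NLocBuffer != 0 && NLocBuffer != *newval)
            return false;

        return true;
    }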






Re: [HACKERS] ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]

2012-09-27 Thread Amit Kapila
 On Friday, September 28, 2012 2:28 AM Pavel Stehule wrote:
 Hello
 
 I reduced this patch to just the plpgsql (SPI) problem solution.

 A correct generic solution needs a discussion about enhancing our libpq
 interface - we can take the number of returned rows, but we should get the
 number of processed rows too. Users have to parse this number from the
 completionTag, but this problem is not too hot and usually is not a
 blocker, because anybody can process the completionTag anyway.

 So this issue is primarily for PL/pgSQL, where this information is not
 available. There is a precedent - CREATE TABLE AS SELECT (thanks for the
 sample :)), and COPY should have the same impact on ROW_COUNT as that
 statement (the last patch implements it).

IMO now this patch is okay. I have marked it as Ready For Committer.

With Regards,
Amit Kapila.


 
 
 2012/9/21 Amit Kapila amit.kap...@huawei.com:
  On Friday, September 21, 2012 1:23 AM Pavel Stehule wrote:
 
  Basic stuff:
  
  - Patch applies OK. but offset difference in line numbers.
  - Compiles with errors in contrib [pg_stat_statements, sepgsql]
  modules
  - Regression failed; one test-case in COPY due to incomplete
  test-case attached patch. – same as reported by Heikki
 
 fixed patch is in attachment
 
  After modifications:
  ---
  - Patch applies OK
  - Compiles cleanly without any errors/warnings
  - Regression tests pass.
 
 
  What it does:
  --
  Modification to get the number of processed rows evaluated via SPI.
  The changes are to add extra parameter in ProcessUtility to get the
  number of rows processed by COPY command.
 
  Code Review Comments:
  -
  1. New parameter is added to ProcessUtility_hook_type function
 but the functions which get assigned to these functions like
 sepgsql_utility_command, pgss_ProcessUtility, prototype 
  definition is not modified.
 
  Functionality is not fixed correctly for the hook functions. In function
  pgss_ProcessUtility, for the below snippet of code, the processed parameter
  is passed as NULL as well as not initialized.
  Because of this, when the pg_stat_statements extension is utilized, the COPY
  command gives garbage values.
      if (prev_ProcessUtility)
          prev_ProcessUtility(parsetree, queryString, params, dest,
                              completionTag, context, NULL);
      else
          standard_ProcessUtility(parsetree, queryString, params, dest,
                                  completionTag, context, NULL);
 
  Testcase is attached.
  In this testcase the table has only 1000 records, but it shows a garbage
  value.
  postgres=# show shared_preload_libraries ;
   shared_preload_libraries
  --------------------------
   pg_stat_statements
  (1 row)
  postgres=# CREATE TABLE tbl (a int);
  CREATE TABLE
  postgres=# INSERT INTO tbl VALUES(generate_series(1,1000));
  INSERT 0 1000
  postgres=# do $$
  declare r int;
  begin
copy tbl to '/home/kiran/copytest.csv' csv;
get diagnostics r = row_count;
raise notice 'exported % rows', r;
truncate tbl;
copy tbl from '/home/kiran/copytest.csv' csv;
get diagnostics r = row_count;
raise notice 'imported % rows', r;
  end;
  $$ language plpgsql;
  postgres$#
  NOTICE:  exported 13281616 rows
  NOTICE:  imported 13281616 rows
  DO
 
 
  2. Why add the new parameter if completionTag holds the number of
  processed tuples? The information can be extracted from it as follows:
      _SPI_current->processed = strtoul(completionTag + 7, NULL, 10);
 
 this is a basic question. I prefer a natural type for the counter - uint64
 instead of text. And there is no simple way to get the offset (7 in this
 case)
 
  I agree with your point, but currently in a few other places we are
  parsing the completion tag to get the number of tuples processed. So
  maybe in future we can change those places as well. For example
 
 yes, this step can be done in future - together with enhancing libpq.
 It is not adequate to change the API (and break a lot of extensions) for this
 isolated problem. So I propose fixing the plpgsql issue, and adding to the ToDo:
 enhance libpq to return the number of processed rows as a number - but
 this change breaks compatibility.
 
 Pavel
 
 
  pgss_ProcessUtility
  {
  ..
 
      /* parse command tag to retrieve the number of affected rows. */
      if (completionTag &&
          sscanf(completionTag, "COPY " UINT64_FORMAT, &rows) != 1)
          rows = 0;
 
  }
 
  _SPI_execute_plan
  {
  ..
  ..
  if (IsA(stmt, CreateTableAsStmt))
  {
          Assert(strncmp(completionTag, "SELECT ", 7) == 0);
          _SPI_current->processed = strtoul(completionTag + 7,
                                            NULL, 10);
 
  ..
  }
 
 
  With Regards,
  Amit Kapila.
 
 
 
  --
 

Re: [HACKERS] ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]

2012-09-27 Thread Pavel Stehule
2012/9/28 Amit Kapila amit.kap...@huawei.com:
 On Friday, September 28, 2012 2:28 AM Pavel Stehule wrote:
 Hello

 I reduced this patch to just the plpgsql (SPI) problem solution.

 A correct generic solution needs a discussion about enhancing our libpq
 interface - we can take the number of returned rows, but we should get the
 number of processed rows too. Users have to parse this number from the
 completionTag, but this problem is not too hot and usually is not a
 blocker, because anybody can process the completionTag anyway.

 So this issue is primarily for PL/pgSQL, where this information is not
 available. There is a precedent - CREATE TABLE AS SELECT (thanks for the
 sample :)), and COPY should have the same impact on ROW_COUNT as that
 statement (the last patch implements it).

 IMO now this patch is okay. I have marked it as Ready For Committer.

Thank you very much for review

Regards

Pavel


 With Regards,
 Amit Kapila.




 2012/9/21 Amit Kapila amit.kap...@huawei.com:
  On Friday, September 21, 2012 1:23 AM Pavel Stehule wrote:
 
  Basic stuff:
  
  - Patch applies OK. but offset difference in line numbers.
  - Compiles with errors in contrib [pg_stat_statements, sepgsql]
  modules
  - Regression failed; one test-case in COPY due to incomplete
  test-case attached patch. – same as reported by Heikki
 
 fixed patch is in attachment
 
  After modifications:
  ---
  - Patch applies OK
  - Compiles cleanly without any errors/warnings
  - Regression tests pass.
 
 
  What it does:
  --
  Modification to get the number of processed rows evaluated via SPI.
  The changes are to add extra parameter in ProcessUtility to get the
  number of rows processed by COPY command.
 
  Code Review Comments:
  -
  1. New parameter is added to ProcessUtility_hook_type function
 but the functions which get assigned to these functions like
 sepgsql_utility_command, pgss_ProcessUtility, prototype 
  definition is not modified.
 
  Functionality is not fixed correctly for the hook functions. In function
  pgss_ProcessUtility, for the below snippet of code, the processed parameter
  is passed as NULL as well as not initialized.
  Because of this, when the pg_stat_statements extension is utilized, the COPY
  command gives garbage values.
      if (prev_ProcessUtility)
          prev_ProcessUtility(parsetree, queryString, params, dest,
                              completionTag, context, NULL);
      else
          standard_ProcessUtility(parsetree, queryString, params, dest,
                                  completionTag, context, NULL);
 
  Testcase is attached.
  In this testcase the table has only 1000 records, but it shows a garbage
  value.
  postgres=# show shared_preload_libraries ;
   shared_preload_libraries
  --------------------------
   pg_stat_statements
  (1 row)
  postgres=# CREATE TABLE tbl (a int);
  CREATE TABLE
  postgres=# INSERT INTO tbl VALUES(generate_series(1,1000));
  INSERT 0 1000
  postgres=# do $$
  declare r int;
  begin
copy tbl to '/home/kiran/copytest.csv' csv;
get diagnostics r = row_count;
raise notice 'exported % rows', r;
truncate tbl;
copy tbl from '/home/kiran/copytest.csv' csv;
get diagnostics r = row_count;
raise notice 'imported % rows', r;
  end;
  $$ language plpgsql;
  postgres$#
  NOTICE:  exported 13281616 rows
  NOTICE:  imported 13281616 rows
  DO
 
 
  2. Why add the new parameter if completionTag holds the number of
  processed tuples? The information can be extracted from it as follows:
      _SPI_current->processed = strtoul(completionTag + 7, NULL, 10);
 
 this is a basic question. I prefer a natural type for the counter - uint64
 instead of text. And there is no simple way to get the offset (7 in this
 case)
 
  I agree with your point, but currently in a few other places we are
  parsing the completion tag to get the number of tuples processed. So
  maybe in future we can change those places as well. For example

 yes, this step can be done in future - together with enhancing libpq.
 It is not adequate to change the API (and break a lot of extensions) for this
 isolated problem. So I propose fixing the plpgsql issue, and adding to the ToDo:
 enhance libpq to return the number of processed rows as a number - but
 this change breaks compatibility.

 Pavel

 
  pgss_ProcessUtility
  {
  ..
 
      /* parse command tag to retrieve the number of affected rows. */
      if (completionTag &&
          sscanf(completionTag, "COPY " UINT64_FORMAT, &rows) != 1)
          rows = 0;
 
  }
 
  _SPI_execute_plan
  {
  ..
  ..
  if (IsA(stmt, CreateTableAsStmt))
  {
          Assert(strncmp(completionTag, "SELECT ", 7) == 0);
          _SPI_current->processed =