Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-05 Thread Hannu Krosing
On Fri, 2012-05-04 at 15:59 -0400, Robert Haas wrote:
 On Fri, May 4, 2012 at 3:49 PM, Hannu Krosing ha...@krosing.net wrote:
  On Fri, 2012-05-04 at 09:52 -0400, Tom Lane wrote:
  Hannu Krosing ha...@2ndquadrant.com writes:
   On Wed, 2012-05-02 at 12:06 -0700, Andrew Dunstan wrote:
   So given that do we do anything about this now, or wait till 9.3?
 
   I'd like the json support in 9.2 updated as follows
 
  I think it's too late to be entertaining proposals for such changes in
  9.2.  If we had concluded that the existing functions were actively
  wrong or a bad idea,
 
  I think that hard-coding the postgresql text representation as our json
  representation, without a possibility for the user to easily fix it
  without rewriting all the xxx_to_json() functions, is borderline actively
  wrong.
 
  Can we at least have the xxx_to_json() functions try a cast to json first
  and fall back to text if the cast fails?
 
 I think the idea that you can involve the casting machinery in this is
 misguided.  sometextval::json has got to mean that sometextval is
 expected to be in the form of a syntactically correct JSON value - and
 NOT that we wrap it in a JSON string.  

PostgreSQL CAST is different from casting a C pointer: casts actually
perform a conversion when defined WITH FUNCTION or WITH INOUT,
and they pass the value through unchanged when defined WITHOUT FUNCTION.

Casts _do_not_ perform syntax checks; they assume the source to be of
the type cast from, and they return a result of the target type.

I think you are confusing input/output formatting with CAST here.

This confusion about cast-as-syntax-check probably qualifies as
something being actively wrong with the current implementation, though it
does not seem to be something actively defined (there are no casts
defined for the json type) but rather an artifact of how postgresql input
works.

CAST is something that should convert one type to another, in this case
a textual type to its json value representation and back.

'sometext'::text::json -- '"sometext"'

and 

'"sometext"'::json::text -- 'sometext'


the suggested syntax check should only be done by the type i/o
functions json_in(cstring) and json_recv(internal), not by casts.

Casts should do casting, which in PostgreSQL means type conversion.
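Hannu's conversion-vs-validation distinction can be illustrated outside SQL. The following Python sketch is only an analogy, with json.dumps standing in for a converting cast and json.loads for the json_in syntax check:

```python
import json

# Conversion (what, on this argument, a cast should do): take a text
# value and produce its JSON representation -- this always succeeds.
assert json.dumps("sometext") == '"sometext"'

# Syntax check (what json_in does): require the input to already be
# well-formed JSON -- this fails on arbitrary text.
assert json.loads('"sometext"') == "sometext"
try:
    json.loads("sometext")  # bare token, not valid JSON
except json.JSONDecodeError:
    print("not valid JSON")
```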


 We can have constructors for
 JSON, but they've got to be separate from the casting machinery.

Currently we do have constructors - json_in(cstring) and
json_recv(internal).

These are the ones that should, and do, check for correct syntax.


Some more confusing examples for pondering cast vs. i/o functions


Some of the difficulties of understanding and explaining what a json
type should be are exemplified in the following


hannu=# create table jtest(plaintext text, jsontext json); 
CREATE TABLE

hannu=# insert into jtest values('"A"','"A"');
hannu=# insert into jtest values('true','true');
hannu=# insert into jtest values('null','null');
hannu=# insert into jtest values(null,null);

hannu=# select row_to_json(jtest) from jtest;
             row_to_json              
--------------------------------------
 {"plaintext":"\"A\"","jsontext":"A"}
 {"plaintext":"true","jsontext":true}
 {"plaintext":"null","jsontext":null}
 {"plaintext":null,"jsontext":null}
(4 rows)

hannu=# insert into jtest values('a','a');
ERROR:  invalid input syntax for type json
LINE 1: insert into jtest values('a','a');
                                     ^
DETAIL:  line 1: Token "a" is invalid.



-- 
---
Hannu Krosing
PostgreSQL Unlimited Scalability and Performance Consultant
2ndQuadrant Nordic
PG Admin Book: http://www.2ndQuadrant.com/books/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-05 Thread Hannu Krosing
On Fri, 2012-05-04 at 16:12 -0400, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  On Fri, May 4, 2012 at 3:49 PM, Hannu Krosing ha...@krosing.net wrote:
  Can we at least have the xxx_to_json() functions try cast to json first
  and fall back to text if the cast fails.
 
  I think the idea that you can involve the casting machinery in this is
  misguided.

 It is possible that that can be made to work, but it's a research
 project, not something to be crammed into 9.2 at the last possible
 minute.  In any case, I really dislike the idea that array_to_json
 and row_to_json would contain two entirely different behaviors.
 Leave the extensibility ideas for a future to_json() function.

I did not mean that array_to_json and row_to_json would be different
from the generic to_json, just that they would be thin wrappers around
the to_json function which check that the arguments are of the correct
types for casting to JSON text.

They need to recurse to each other and the generic to_json anyway.






Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-05 Thread Peter Eisentraut
On fre, 2012-05-04 at 13:43 -0400, Robert Haas wrote:
 For this particular case, I think you just need some place to store a
 pg_type -> pg_proc mapping.  I'm not exactly sure how to make that not
 a JSON-specific hack, since I certainly don't think we'd want to add a
 new catalog just for that.

I think you're thinking of pg_cast.




Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-05 Thread Peter Eisentraut
On fre, 2012-05-04 at 12:30 -0400, Andrew Dunstan wrote:
 Yeah, what I've been thinking about in conjunction with similar
 problems is some sort of type registry, so that we could code for
 non-builtin types in certain cases.

It certainly seems to come up a lot, but I'm not sure whether the two
main use cases -- mapping types into languages, and mapping types into
JSON (or XML, as an alternative) -- would have the same solution.  A
third use case that could be included as well is changing the main text
or binary format of a type (e.g., what datestyle does).  So instead of
having just a hard-coded set of typinput/typoutput, typrecv/typsend,
you'd have a catalog of

(context, inputfunc, outputfunc)

and the context would be a language, something like json or xml, or a
user-settable value.  This could become quite complicated, but it sure
could solve a lot of issues in one go.
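A minimal sketch of such a catalog, in Python for illustration only (the names register_io and render are invented here, and the real design would live in a system catalog, not a dict):

```python
from datetime import date

# (type, context) -> outputfunc; the real catalog would also carry an
# inputfunc, per the (context, inputfunc, outputfunc) triple above.
registry = {}

def register_io(typ, context, outputfunc):
    registry[(typ, context)] = outputfunc

def render(value, context):
    # Fall back to the plain text representation when no context-specific
    # entry exists, mirroring the hard-coded typoutput default.
    fn = registry.get((type(value), context))
    return fn(value) if fn else str(value)

# A json context quotes dates as ISO strings...
register_io(date, "json", lambda d: '"%s"' % d.isoformat())
# ...while an xml context might tag them instead.
register_io(date, "xml", lambda d: "<date>%s</date>" % d.isoformat())

print(render(date(2012, 5, 1), "json"))  # "2012-05-01"
print(render(date(2012, 5, 1), "xml"))   # <date>2012-05-01</date>
```

The lookup key (type, context) is the essential point: one type can have several output behaviors selected by context, which is what a single hard-coded typoutput cannot express.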





Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-05 Thread Peter Eisentraut
On fre, 2012-05-04 at 15:59 -0400, Robert Haas wrote:
  Can we at least have the xxx_to_json() functions try cast to json
 first
  and fall back to text if the cast fails.
 
 I think the idea that you can involve the casting machinery in this is
 misguided.  sometextval::json has got to mean that sometextval is
 expected to be in the form of a syntactically correct JSON value - and
 NOT that we wrap it in a JSON string.

I think it's only wrong if you try casting first and fall back to text.
Otherwise it could work, if the set of all json casts is defined
consistently.




Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-05 Thread Hannu Krosing
On Sat, 2012-05-05 at 12:16 +0300, Peter Eisentraut wrote:
 On fre, 2012-05-04 at 15:59 -0400, Robert Haas wrote:
   Can we at least have the xxx_to_json() functions try cast to json
  first
   and fall back to text if the cast fails.
  
  I think the idea that you can involve the casting machinery in this is
  misguided.  sometextval::json has got to mean that sometextval is
  expected to be in the form of a syntactically correct JSON value - and
  NOT that we wrap it in a JSON string.
 
 I think it's only wrong if you try casting first and fall back to text.
 Otherwise it could work, if the set of all json casts is defined
 consistently.

Currently the default cast for non-number, non-bool, not-already-json,
non-null values is to wrap the text representation in double quotes.

So casting first and then falling back to _quoted_ text is wrong only for
those types which have a very ugly text representation :)
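To see what the quoted-text fallback does to such a type, here is the effect in Python terms (json.dumps standing in for the fallback; the hstore-style text is just an example input, not output from PostgreSQL):

```python
import json

# An hstore value's text representation...
hstore_text = '"foo"=>"1", "bar"=>"baz"'

# ...gets wrapped by the fallback into a single escaped JSON string,
# rather than into a JSON object.
print(json.dumps(hstore_text))
```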





Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-05 Thread Tom Lane
Hannu Krosing ha...@2ndquadrant.com writes:
 CAST is something that should convert one type to another, in this case
 a textual type to its json value representation and back.
 'sometext'::text::json -- '"sometext"'
 and 
 '"sometext"'::json::text -- 'sometext'

Well, that's a pretty interesting example, because if you expect that to
work like that, then what should happen with this?

'{"f1":4,"f2":2}'::text::json
'{"f1":4,"f2":2}'::json::text

Does the first of these produce a JSON object, or a quoted string?
At the moment you get the former, but it's pretty hard to argue that
it shouldn't produce a quoted string if transparent conversion is the
expectation.  In the second case, do you end up with a JSON text (which
is what happens at the moment) or does it just slap some quotes around
the value as a string?  I'm not convinced that you've made a principled
argument as to what should happen when.
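Tom's ambiguity can be restated concretely. In Python terms (an analogy only, not PostgreSQL code), the same input string yields two different results depending on whether the cast means "parse" or "wrap":

```python
import json

s = '{"f1":4,"f2":2}'

# Reading the cast as "parse": the text must already be JSON, and the
# result is an object.
parsed = json.loads(s)
assert parsed == {"f1": 4, "f2": 2}

# Reading the cast as "wrap": any text becomes a JSON string, braces
# and all.
wrapped = json.dumps(s)
assert wrapped == '"{\\"f1\\":4,\\"f2\\":2}"'
```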

In general, I think casts should only be used for conversions where
there is just one unsurprising choice of behavior, since the cast syntax
by definition doesn't provide any room for options.  It's not clear to
me that JSON conversions are so obvious as to meet that standard.  If
you start throwing random user-defined conversions into the mix, it's
even less obvious that there's only one unsurprising choice.

In the particular case of casts to and from text, we've essentially
set a project policy that those should behave equivalently to the type's
I/O conversion functions whenever possible.  So I think the existing
behavior of those operations is correct and what you propose above is
wrong.  There is certainly scope for a conversion function that takes
any random text string and produces a JSON quoted string from it, but
the cast operator is not the place for that.

regards, tom lane



Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-04 Thread Hannu Krosing
On Wed, 2012-05-02 at 12:06 -0700, Andrew Dunstan wrote:
 
 
 On Wed, May 2, 2012 at 2:29 AM, Hannu Krosing ha...@2ndquadrant.com
 wrote:
 
 
 
 I don't object to row_to_json() and array_to_json() functions
 being
 there as a convenience and as the two official functions
 guaranteed to
 return JSON text.
 
 
 So given that do we do anything about this now, or wait till 9.3?

Sorry for missing this mail - I followed this only on list.
I hope it is ok to CC this back to the list.



I'd like the json support in 9.2 updated as follows


Generic to_json(...) returning a JSON value
=

we should have a generic to_json(...) both for ease of use and for easy
extensibility, as explained below.

to_json(...) should work for all types, returning a json value similar
to what the current json_in does, but for any input type, not just cstring.

We could keep row_to_json() and array_to_json() as official json-text
returning functions.


Configurable and extensible to_json()
==

When working on structured types, the first try for getting an
element-as-json should always be running to_json(element), and only if
this fails should we fall back to the current text-representation code.

this enables two important things

1) configurable json-coding of values not explicitly supported by the
standard

You can read about an attempt to standardise the json date format here:

http://weblogs.asp.net/bleroy/archive/2008/01/18/dates-and-json.aspx .

By allowing developers to just define their own to_json(date) function
we give them the power to decide which one to use. And if we honour
search_path when looking up the to_json() functions, then they can even
choose to have different conventions for different applications.

2) flexibility in adding support for extension types, like representing
hstore as an object/dict by just providing the to_json(hstore, ...)
function in the hstore extension

Pretty-printing
===

If we were to support prettyprinting of anything more complex than
single-level structs (record or array), then we need to pass an indent
into the to_json() function.

my recommendation would be to have the signature 

to_json(datum any, indent int)

with indent = NULL meaning no prettyprint, indent = 0 meaning top level,
or starting at the left margin, and anything else meaning the amount of
spaces to be added to the beginning of all rows except the first
one. For example the query 

hannu=# select array_to_json(array(select test from test limit 2),true);
                             array_to_json
------------------------------------------------------------------------
 [{"id":9,"data":"testdata","tstamp":"2012-05-01 09:44:50.175189"}, +
  {"id":10,"data":"testdata","tstamp":"2012-05-01 09:45:50.260276"}]
(1 row)

could return this:

[{'data': 'testdata',
  'id': 9,
  'tstamp': '2012-05-01 09:44:50.175189'},
 {'data': 'testdata',
  'id': 10,
  'tstamp': '2012-05-01 09:45:50.260276'}]

if it would call to_json(row, 1) for getting each row prettyprinted with
indent 1
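The effect of threading an indent argument through nested calls can be approximated with Python's json module. This is an analogy only; the proposed to_json(datum, indent) signature is a sketch, not an existing PostgreSQL function:

```python
import json

rows = [
    {"id": 9,  "data": "testdata", "tstamp": "2012-05-01 09:44:50.175189"},
    {"id": 10, "data": "testdata", "tstamp": "2012-05-01 09:45:50.260276"},
]

# indent=1 adds one space per nesting level, giving roughly the layout
# sketched above: each row object starts on its own indented line.
print(json.dumps(rows, indent=1))
```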


Getting a record _from_ json()
==

JSON support would be much more useful if we supported conversion in
the other direction as well, that is from json to record.

The model easiest to support seems to be what Merlin suggested, that
is the populate_record(record, hstore) function and corresponding #=
operator (see: http://www.postgresql.org/docs/9.1/static/hstore.html)
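A rough Python sketch of those semantics (the populate_record name follows hstore's function of the same name; the implementation here is invented purely for illustration):

```python
import json

def populate_record(record, js):
    """Fill fields of an existing record from a JSON object; keys not
    present in the record are ignored, and absent keys keep old values,
    mirroring how hstore's populate_record matches on column names."""
    updates = json.loads(js)
    return {k: updates.get(k, v) for k, v in record.items()}

row = {"id": None, "data": None, "tstamp": "2012-05-01"}
print(populate_record(row, '{"id": 9, "data": "testdata"}'))
# {'id': 9, 'data': 'testdata', 'tstamp': '2012-05-01'}
```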



The complete plan for 9.2 once more
===

What is needed to nicely round out a simple and powerful json type is

1. the json TYPE

   an agreement that the json type represents a JSON value (this is what 
   json_in currently does)


2. json output

2.1 doing the right thing with internal values

   to_json() function for converting to this JSON value for any type.
   the default implementation for 'non-json' types returns their 
   postgresql textual representation in double quotes 
   (date -> "2012-05-01 09:45:50.260276")

   structured types use to_json() for getting values internally, 
   so that by defining to_json(hstore) you can automatically get 
   hstore represented in javascript object or dictionary representation

   hannu=# select row_to_json(r) 
   from (select 1::int id, 'foo=>1, bar=>baz'::hstore)r;

   should not return this:

                     row_to_json  
   ---------------------------------------------------
    {"id":1,"hstore":"\"bar\"=>\"baz\", \"foo\"=>\"1\""}
   (1 row)

   but this
                     row_to_json  
   ---------------------------------------------------
    {"id":1,"hstore":{"bar": "baz", "foo": 1}}
   (1 row)

2.2 getting the pretty-printing right for structured types 

   to_json(any, indent) functions for getting recursive prettyprinting.
   we might also need another argument telling the page width we want
   to pretty print to.


3. json input 

Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-04 Thread Tom Lane
Hannu Krosing ha...@2ndquadrant.com writes:
 On Wed, 2012-05-02 at 12:06 -0700, Andrew Dunstan wrote:
 So given that do we do anything about this now, or wait till 9.3?

 I'd like the json support in 9.2 updated as follows

I think it's too late to be entertaining proposals for such changes in
9.2.  If we had concluded that the existing functions were actively
wrong or a bad idea, then of course we'd need to do something; but they
are not, so we can just as well consider additions in the 9.3 cycle
rather than now.  I am not convinced that this proposal is fully baked
yet, anyway; not to mention that right now we need to have our heads
down on resolving the remaining open issues, not designing,
implementing, and reviewing a pile of brand new code for json.

 By allowing developers just to define their own to_json(date) function
 we give them the power do decide which one to use. And if we honour
 search_path when looking up the to_json() functions, then they can even
 choose to have different conventions for different applications.

This is not going to work anywhere near as nicely as you think.  If
somebody tries to define multiple to_json() functions that override a
generic to_json(anyelement) one, he will start getting "function is not
unique" parse failures.  The parser will only successfully decide which
function to call when the input data type exactly matches one of the
specialized functions, which means you might as well not have the
generic one at all.

regards, tom lane



Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-04 Thread Andrew Dunstan



On 05/04/2012 09:52 AM, Tom Lane wrote:

Hannu Krosingha...@2ndquadrant.com  writes:

On Wed, 2012-05-02 at 12:06 -0700, Andrew Dunstan wrote:

So given that do we do anything about this now, or wait till 9.3?

I'd like the json support in 9.2 updated as follows

I think it's too late to be entertaining proposals for such changes in
9.2.  If we had concluded that the existing functions were actively
wrong or a bad idea, then of course we'd need to do something; but they
are not, so we can just as well consider additions in the 9.3 cycle
rather than now.  I am not convinced that this proposal is fully baked
yet, anyway; not to mention that right now we need to have our heads
down on resolving the remaining open issues, not designing,
implementing, and reviewing a pile of brand new code for json.



Yeah, that was my feeling. We usually take a release or two to get 
things right, fill in what's missing, etc., and I don't think this will 
be any different.






By allowing developers just to define their own to_json(date) function
we give them the power do decide which one to use. And if we honour
search_path when looking up the to_json() functions, then they can even
choose to have different conventions for different applications.

This is not going to work anywhere near as nicely as you think.  If
somebody tries to define multiple to_json() functions that override a
generic to_json(anyelement) one, he will start getting "function is not
unique" parse failures.  The parser will only successfully decide which
function to call when the input data type exactly matches one of the
specialized functions, which means you might as well not have the
generic one at all.




Yeah, what I've been thinking about in conjunction with similar problems 
is some sort of type registry, so that we could code for non-builtin 
types in certain cases. Maybe we should add that to the developers' 
meeting agenda.


cheers

andrew





Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-04 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 Yeah, what I've been thinking about in conjunction with similar problems 
 is some sort of type registry, so that we could code for non-builtin 
 types in certain cases. Maybe we should add that to the developers' 
 meeting agenda.

Maybe.  I don't want to see a json-specific hack for this, but some sort
of generic way to add type knowledge could be useful, if we could figure
out what we want.

regards, tom lane



Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-04 Thread Robert Haas
On Fri, May 4, 2012 at 12:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andrew Dunstan and...@dunslane.net writes:
 Yeah, what I've been thinking about in conjunction with similar problems
 is some sort of type registry, so that we could code for non-builtin
 types in certain cases. Maybe we should add that to the developers'
 meeting agenda.

 Maybe.  I don't want to see a json-specific hack for this, but some sort
 of generic way to add type knowledge could be useful, if we could figure
 out what we want.

For this particular case, I think you just need some place to store a
pg_type -> pg_proc mapping.  I'm not exactly sure how to make that not
a JSON-specific hack, since I certainly don't think we'd want to add a
new catalog just for that.

In general, I think it would be very useful to have some way of
identifying particular types - and versions of types - independently
of a particular installation - e.g. by assigning each type a UUID that
never changes and a version number that we bump when we change
something about that type.  That seems like it would be very useful
for schema comparison tools, or for logical replication, where you
want to know whether two types are the same type even though they
are in different clusters.  pg_upgrade has had past needs in this area
as well.  However, I'm not sure that'd help solve this particular
problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-04 Thread Hannu Krosing
On Fri, 2012-05-04 at 13:43 -0400, Robert Haas wrote:
 On Fri, May 4, 2012 at 12:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Andrew Dunstan and...@dunslane.net writes:
  Yeah, what I've been thinking about in conjunction with similar problems
  is some sort of type registry, so that we could code for non-builtin
  types in certain cases. Maybe we should add that to the developers'
  meeting agenda.
 
  Maybe.  I don't want to see a json-specific hack for this, but some sort
  of generic way to add type knowledge could be useful, if we could figure
  out what we want.
 
 For this particular case, I think you just need some place to store a
 pg_type -> pg_proc mapping.  I'm not exactly sure how to make that not
 a JSON-specific hack, since I certainly don't think we'd want to add a
 new catalog just for that.

This was my initial proposal - to have casts to ::json for all types.

I backed out from this in favor of a generic to_json(datum, indent) in
order to support prettyprinting.






Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-04 Thread Hannu Krosing
On Fri, 2012-05-04 at 09:52 -0400, Tom Lane wrote:
 Hannu Krosing ha...@2ndquadrant.com writes:
  On Wed, 2012-05-02 at 12:06 -0700, Andrew Dunstan wrote:
  So given that do we do anything about this now, or wait till 9.3?
 
  I'd like the json support in 9.2 updated as follows
 
 I think it's too late to be entertaining proposals for such changes in
 9.2.  If we had concluded that the existing functions were actively
 wrong or a bad idea,

I think that hard-coding the postgresql text representation as our json
representation, without a possibility for the user to easily fix it
without rewriting all the xxx_to_json() functions, is borderline actively
wrong.

Can we at least have the xxx_to_json() functions try a cast to json first
and fall back to text if the cast fails?

This would address my worst problem, all the rest can be easily defined
in user functions.

  then of course we'd need to do something; but they
 are not, so we can just as well consider additions in the 9.3 cycle
 rather than now.  I am not convinced that this proposal is fully baked
 yet, anyway; not to mention that right now we need to have our heads
 down on resolving the remaining open issues, not designing,
 implementing, and reviewing a pile of brand new code for json.
 
  By allowing developers just to define their own to_json(date) function
  we give them the power do decide which one to use. And if we honour
  search_path when looking up the to_json() functions, then they can even
  choose to have different conventions for different applications.
 
 This is not going to work anywhere near as nicely as you think.  If
 somebody tries to define multiple to_json() functions that override a
 generic to_json(anyelement) one, he will start getting "function is not
 unique" parse failures.  The parser will only successfully decide which
 function to call when the input data type exactly matches one of the
 specialized functions, which means you might as well not have the
 generic one at all.
 





Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-04 Thread Robert Haas
On Fri, May 4, 2012 at 3:49 PM, Hannu Krosing ha...@krosing.net wrote:
 On Fri, 2012-05-04 at 09:52 -0400, Tom Lane wrote:
 Hannu Krosing ha...@2ndquadrant.com writes:
  On Wed, 2012-05-02 at 12:06 -0700, Andrew Dunstan wrote:
  So given that do we do anything about this now, or wait till 9.3?

  I'd like the json support in 9.2 updated as follows

 I think it's too late to be entertaining proposals for such changes in
 9.2.  If we had concluded that the existing functions were actively
 wrong or a bad idea,

 I think that hard-coding the postgresql text representation as our json
 representation, without a possibility for the user to easily fix it
 without rewriting all the xxx_to_json() functions, is borderline actively
 wrong.

 Can we at least have the xxx_to_json() functions try cast to json first
 and fall back to text if the cast fails.

I think the idea that you can involve the casting machinery in this is
misguided.  sometextval::json has got to mean that sometextval is
expected to be in the form of a syntactically correct JSON value - and
NOT that we wrap it in a JSON string.  We can have constructors for
JSON, but they've got to be separate from the casting machinery.




Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-04 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, May 4, 2012 at 3:49 PM, Hannu Krosing ha...@krosing.net wrote:
 Can we at least have the xxx_to_json() functions try cast to json first
 and fall back to text if the cast fails.

 I think the idea that you can involve the casting machinery in this is
 misguided.

It is possible that that can be made to work, but it's a research
project, not something to be crammed into 9.2 at the last possible
minute.  In any case, I really dislike the idea that array_to_json
and row_to_json would contain two entirely different behaviors.
Leave the extensibility ideas for a future to_json() function.

regards, tom lane



Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-04 Thread Hannu Krosing
On Fri, 2012-05-04 at 15:59 -0400, Robert Haas wrote:
 On Fri, May 4, 2012 at 3:49 PM, Hannu Krosing ha...@krosing.net wrote:
  On Fri, 2012-05-04 at 09:52 -0400, Tom Lane wrote:
  Hannu Krosing ha...@2ndquadrant.com writes:
   On Wed, 2012-05-02 at 12:06 -0700, Andrew Dunstan wrote:
   So given that do we do anything about this now, or wait till 9.3?
 
   I'd like the json support in 9.2 updated as follows
 
  I think it's too late to be entertaining proposals for such changes in
  9.2.  If we had concluded that the existing functions were actively
  wrong or a bad idea,
 
  I think that hard-coding the postgresql text representation as our json
  representation, without a possibility for the user to easily fix it
  without rewriting all the xxx_to_json() functions, is borderline actively
  wrong.
 
  Can we at least have the xxx_to_json() functions try cast to json first
  and fall back to text if the cast fails.
 
 I think the idea that you can involve the casting machinery in this is
 misguided.  sometextval::json has got to mean that sometextval is
 expected to be in the form of a syntactically correct JSON value - and
 NOT that we wrap it in a JSON string.  We can have constructors for
 JSON, but they've got to be separate from the casting machinery.

on the contrary - the string representation of textual value a is "a"

casting should _not_ mean syntax check; casting is by definition a
conversion.

if we cast text to int, we return a value of type int; if we cast int to
numeric(5,2), we return a value of type numeric(5,2)

why should casting to json work differently ?






Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-04 Thread David Johnston
 -Original Message-
 From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-
 ow...@postgresql.org] On Behalf Of Hannu Krosing
 Sent: Friday, May 04, 2012 4:40 PM
 To: Robert Haas
 Cc: Tom Lane; Andrew Dunstan; PostgreSQL-development; Merlin Moncure
 Subject: Re: [HACKERS] JSON in 9.2 - Could we have just one to_json()
 function instead of two separate versions ?
 
 On Fri, 2012-05-04 at 15:59 -0400, Robert Haas wrote:
  On Fri, May 4, 2012 at 3:49 PM, Hannu Krosing ha...@krosing.net
 wrote:
   On Fri, 2012-05-04 at 09:52 -0400, Tom Lane wrote:
   Hannu Krosing ha...@2ndquadrant.com writes:
On Wed, 2012-05-02 at 12:06 -0700, Andrew Dunstan wrote:
So given that do we do anything about this now, or wait till 9.3?
  
I'd like the json support in 9.2 updated as follows
  
   I think it's too late to be entertaining proposals for such changes
   in 9.2.  If we had concluded that the existing functions were
   actively wrong or a bad idea,
  
   I think that hard-coding the postgresql text representation as our
   json representation, without a possibility for the user to easily
   fix it without rewriting all the xxx_to_json() functions, is borderline
   actively wrong.
  
   Can we at least have the xxx_to_json() functions try cast to json
   first and fall back to text if the cast fails.
 
  I think the idea that you can involve the casting machinery in this is
  misguided.  sometextval::json has got to mean that sometextval is
  expected to be in the form of a syntactically correct JSON value - and
  NOT that we wrap it in a JSON string.  We can have constructors for
  JSON, but they've got to be separate from the casting machinery.
 
 on the contrary - the string representation of textual value a is "a"
 
 casting should _not_ mean a syntax check; casting is by definition a conversion.
 
 If we cast text to int, we return a value of type int; if we cast int to
 numeric(5,2), we return a value of type numeric(5,2).
 
 Why should casting to json work differently?
 

What is the distinction between what you are thinking regarding JSON and this 
example?

SELECT '1a'::integer;
SQL Error: ERROR:  invalid input syntax for integer: "1a"
LINE 1: SELECT '1a'::integer

As a user, if I cast something to something else I want the result to be of the 
correct type and deterministic; otherwise throw me some kind of invalid input 
format exception (or syntax exception).  Casting vs. constructors is really a 
meaningless distinction to a lay person.  When I cast, I do so by constructing a 
new value using my existing value for input.  When I use an explicit CAST I am 
unable to supply additional parameters to configure the casting, whereas a 
constructor function gives me that possibility.  But a constructor function 
without any additional parameters is not semantically different from a cast.

I guess the concern to address is something like:

SELECT '{"key": "value"}'::json OR SELECT '[1.25]'::json;  Do you interpret these 
as already being valid JSON and thus output object/array constructs (i.e., JSON 
Text), or do you treat them as string literals and output scalars (i.e., JSON 
Value)?  Even if you feel these are artificial constructs, the concept holds 
that there may be ambiguous data that can be interpreted in multiple ways (this 
applies even to the function forms, though in the function form you could 
specify which one you want to output using a separate DEFAULTed parameter). 
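The ambiguity is easy to see outside SQL as well; in Python terms (my own illustration of the two readings, not PostgreSQL behavior), the same input string supports both interpretations:

```python
import json

s = '{"key": "value"}'

# Reading 1: treat the text as already-serialized JSON -- parse it,
# yielding an object (the "JSON Text" interpretation).
as_json_text = json.loads(s)
print(as_json_text)   # {'key': 'value'}

# Reading 2: treat the text as a plain string literal -- serialize it,
# yielding a quoted, escaped JSON string scalar (the "JSON Value" reading).
as_json_value = json.dumps(s)
print(as_json_value)  # "{\"key\": \"value\"}"
```

Both readings are self-consistent; only a convention (or an extra parameter, as suggested above) can pick between them.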



I can see the primary use-case for JSON Value casting as being queries of the 
following forms (since the record and array forms are going through the 
record/array_to_json function):

SELECT COUNT(*)::json FROM table [WHERE ...];
SELECT single_col::json FROM table WHERE id = ?;

Where the single provided value can be sent directly back to the web-caller 
JavaScript and used as-is because it is valid JSON.  Though, honestly, both 
SELECT to_json(single_col) and SELECT to_json(COUNT(*)) are equally usable so 
any distinction between them is a pure technical issue to me.

Am I correct in assuming the following expected behavior (forgive the 
blatantly wrong syntax, but you should get the point)?

RAISE NOTICE '%', SELECT 'A'::text  =  A
RAISE NOTICE '%', SELECT 'A'::json  =  "A"

David J





Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-02 Thread Hannu Krosing
On Tue, 2012-05-01 at 21:22 -0400, David Johnston wrote:
 On May 1, 2012, at 20:41, Hannu Krosing ha...@2ndquadrant.com wrote:
  
  Most people don't work in strongly-typed environment, and thus would
  work around such restriction if they need a simple JSON value at the
  other end of the interchange.
  
  
  My personal take it is have it fail since any arbitrary decision to cast 
  to JSON Text

For arrays and records the JSON text and JSON value are exactly the
same; it is just that the JSON representations of simple types are
officially not JSON texts.

  is going to make someone unhappy and supposedly they can 
  modify their query so that the result generates whatever format they 
  desire.
  
  Do you actually have such an experience or is it just a wild guess ?
  
  
 
 So even given the semantic differences between an object and a scalar 
 I am better understanding where interpreting JSON as JSON Value makes 
 sense.  However, if I convert a record or array to JSON I expect to get 
 a JSON Text even if there is only a single column or value in the input.  

Of course you will, and you will get a JSON Text even for an empty object
or array. 

array[1] and 1 and {'one':1} are all different and will stay such.

 I guess my take is that record -> JSON Text while anything else is JSON 
 Value.  Whether maintaining the special case for record is worthwhile I 
 really do not know, but the semantic difference does exist; and record 
 output is a significant aspect of PostgreSQL output.

I have never suggested that we special-case an 1-element record or list
and start returning only the contained value for these.

 I get the ease-of-use aspect but also recognize that sometimes being slightly 
 harder to use is worthwhile if you eliminate ambiguities or limit the 
 possibility to make mistakes.

There are no ambiguities in what is returned for record or array.

But not being able to return JSON values via a cast to json for some types,
or not using such casts, will make extending the json support for types
by users much harder.  And nonstandard. 

Using a simple cast to json is a very PostgreSQL-ish way to add json
support to any type.



-- 
---
Hannu Krosing
PostgreSQL Unlimited Scalability and Performance Consultant
2ndQuadrant Nordic
PG Admin Book: http://www.2ndQuadrant.com/books/




Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-02 Thread Hannu Krosing
On Tue, 2012-05-01 at 19:11 -0400, Tom Lane wrote:
 Andrew Dunstan and...@dunslane.net writes:
  On Tue, May 1, 2012 at 9:56 AM, Joey Adams 
  joeyadams3.14...@gmail.comwrote:
  No, the RFC says (emphasis mine):
  
  A JSON *text* is a serialized object or array.
  
  If we let the JSON type correspond to a *value* instead, this
  restriction does not apply, and the JSON type has a useful recursive
  definition.
 
  I think you're playing with words. But in any case, the RFC says this
  regarding generators:
  5. Generators
 A JSON generator produces JSON text.  The resulting text MUST
 strictly conform to the JSON grammar.
 
 I read over the RFC, and I think the only reason why they restricted
 JSON texts to represent just a subset of JSON values is this cute
 little hack in section 3 (Encoding):
 
Since the first two characters of a JSON text will always be ASCII
characters [RFC0020], it is possible to determine whether an octet
stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
at the pattern of nulls in the first four octets.
00 00 00 xx  UTF-32BE
00 xx 00 xx  UTF-16BE
xx 00 00 00  UTF-32LE
xx 00 xx 00  UTF-16LE
xx xx xx xx  UTF-8
 
 They need a guaranteed 2 ASCII characters to make that work, and
 they won't necessarily get that many with a bare string literal.
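The encoding-detection table quoted above can be sketched as a tiny detector (my own illustration of the RFC's rule, not anything in PostgreSQL):

```python
def detect_json_encoding(octets: bytes) -> str:
    """Guess the encoding of a JSON text from its first four octets
    (RFC 4627, section 3).  This only works because a JSON *text* must
    begin with two ASCII characters; a bare string literal breaks it."""
    b = octets[:4]
    if len(b) < 4:
        return "UTF-8"  # too short for the table; assume the default
    if b[0] == 0 and b[1] == 0 and b[2] == 0:   # 00 00 00 xx
        return "UTF-32BE"
    if b[0] == 0 and b[2] == 0:                 # 00 xx 00 xx
        return "UTF-16BE"
    if b[1] == 0 and b[2] == 0 and b[3] == 0:   # xx 00 00 00
        return "UTF-32LE"
    if b[1] == 0 and b[3] == 0:                 # xx 00 xx 00
        return "UTF-16LE"
    return "UTF-8"                              # xx xx xx xx

print(detect_json_encoding("[1]".encode("utf-16-le")))  # UTF-16LE
```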
 
 Since for our purposes there is not, and never will be, any need to
 figure out whether a JSON input string is encoded in UTF16 or UTF32,
 I find myself agreeing with the camp that says we might as well consider
 that our JSON type corresponds to JSON values not JSON texts.  I also
 notice that json_in() seems to believe that already.
 
 However, that doesn't mean I'm sold on the idea of getting rid of
 array_to_json and row_to_json in favor of a universal to_json()
 function.  In particular, both of those have optional pretty_bool
 arguments that don't fit nicely at all in a generic conversion
 function.  The meaning of that flag is very closely tied to the
 input being an array or record respectively.

The flags probably should not be tied to a specific type; as JSON is
recursive, I think the current one-top-level-element-per-row approach is
quite a limited form of pretty-printing.

I have a table with a field whose type is an array of the row type of
another table, and what I currently get with pretty=true is

hannu=# select row_to_json(test3, true) from test3;
-[ RECORD 1 ]---------------------------------------------------------------
row_to_json | {"id":1,
            |  "data3":[{"id":1,"data2":{"id":1,"data":0.262814193032682,"tstamp":"2012-04-05 13:21:03.235204"},"tstamp":"2012-04-05 13:25:03.644497"},{"id":2,"data2":{"id":2,"data":0.157406373415142,"tstamp":"2012-04-05 13:21:05.2033"},"tstamp":"2012-04-05 13:25:03.644497"}],
            |  "tstamp":"2012-04-16 14:40:15.795947"}

What I would like to get is what python's pprint does for the same json:

>>> pprint(row)
{'id': 1,
 'data3': [{'data2': {'data': '0.262814193032682',
  'id': 1,
  'tstamp': '2012-04-05 13:21:03.235204'},
'id': 1,
'tstamp': '2012-04-05 13:25:03.644497'},
   {'data2': {'data': '0.157406373415142',
  'id': 2,
  'tstamp': '2012-04-05 13:21:05.2033'},
'id': 2,
'tstamp': '2012-04-05 13:25:03.644497'}],
 'tstamp': '2012-04-16 14:40:15.795947'}

If we have a pretty flag, why not make it work all the way down the
structure?

 I'm inclined to leave these functions as they are, and consider
 adding a universal to_json(anyelement) (with no options) later.

To achieve recursive pretty-printing, the better way is to have a
universal to_json(anyelement) with a pretty-printing option 

to_json(datum anyelement, indent int)

with the behavior that if indent is NULL or a negative integer, no
pretty-printing is done; if it is 0, printing starts at the left margin;
and if it is a positive integer, that number of spaces is added to the
left for each row (except the first one) of the json representation.
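For comparison (my own illustration, outside PostgreSQL), generic JSON serializers already apply an indent recursively at every nesting level, which is essentially the behavior proposed here:

```python
import json

row = {"id": 1,
       "data3": [{"id": 1,
                  "data2": {"id": 1, "data": "0.262814193032682",
                            "tstamp": "2012-04-05 13:21:03.235204"}}],
       "tstamp": "2012-04-16 14:40:15.795947"}

compact = json.dumps(row)           # no indent: one line, no pretty-printing
pretty = json.dumps(row, indent=2)  # indent applies at *every* nesting depth
print(pretty)
```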

And it would be overridable for specific types, so that hstore could
provide its own

to_json(datum hstore, indent int)

which would do the correct pretty-printing for the hstore-as-json-object
representation.

 Because it would not have options, it would not be meant to cover
 cases where there's value in formatting or conversion options;
 so it wouldn't render the existing functions entirely obsolete,
 nor would it mean there would be no need for other specialized
 conversion functions.

I don't object to row_to_json() and array_to_json() functions being
there as a convenience and as the two official functions guaranteed to
return JSON text.

   regards, tom lane

-- 
---

Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-01 Thread Merlin Moncure
On Tue, May 1, 2012 at 7:02 AM, Hannu Krosing ha...@2ndquadrant.com wrote:
 Hi hackers

 After playing around with array_to_json() and row_to_json() functions a
 bit, I have a question - why do we even have 2 variants *_to_json()

 Collapsing array_to_json() and row_to_json() into just to_json()

I asked the same question.  It was noted that the xml functions aren't
overloaded like that and that it's cleaner to introduce datum specific
behaviors if you don't overload.

I don't really agree with that or any of the naming styles that are in
the form inputtype_func() but I think most people are on the other
side of the argument.

merlin



Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-01 Thread Hannu Krosing
On Tue, 2012-05-01 at 08:18 -0500, Merlin Moncure wrote:
 On Tue, May 1, 2012 at 7:02 AM, Hannu Krosing ha...@2ndquadrant.com wrote:
  Hi hackers
 
  After playing around with array_to_json() and row_to_json() functions a
  bit, I have a question - why do we even have 2 variants *_to_json()
 
  Collapsing array_to_json() and row_to_json() into just to_json()
 
 I asked the same question.  It was noted that the xml functions aren't
 overloaded like that and that it's cleaner to introduce datum specific
 behaviors if you don't overload.

XML, being an enterprise thing, is a large and complex beast.

Javascript - and by extension json - comes from the other end, being
lightweight and elegant at its core. 

Also, the *_to_xml functions present still don't match what is there
for json; they don't even overlap!

Thus I see no reason why deciding on how to_json() functions (or cast to
json) should work needs to be based on how xml works.

We currently don't have any of the database_to_json() or
querystring_to_json() and we don't need these either. 

I'd be much more happy by just having a working cast to json from all
types, not a myriad of functions for all possible types -
int4_to_json(), text_to_json(), bool_to_json(), record_to_json(),
array_to_json(), pg_user_to_json, etc. etc. etc.


What we currently have exposed to userspace are two arbitrarily chosen
complex-type functions - 

array_to_json() for converting arrays of ANY element type to json,
including arrays consisting of records which may again contain arrays and
records,

and

row_to_json() for converting rows, again potentially consisting of ANY
TYPE, including arrays of any type and any complex type.  It handles even
the row() type :)

hannu=# select row_to_json(row(1,2,3));
       row_to_json       
-------------------------
 {"f1":1,"f2":2,"f3":3}
(1 row)


What we currently lack is direct conversion for simple types, though
it is easily achieved by converting to a single-element array and
then stripping the outer [] from the result. 
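The single-element-array workaround can be sketched generically (my own illustration of the trick, not server-side code):

```python
import json

def datum_to_json(value) -> str:
    """Serialize one value by wrapping it in a one-element array,
    serializing the array, and stripping the outer brackets."""
    array_text = json.dumps([value])  # e.g. '[1.25]' or '["hello"]'
    return array_text[1:-1]

print(datum_to_json(1.25))     # 1.25
print(datum_to_json("hello"))  # "hello"
```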


It would be really nice to also have the casts from json to any type,
including records though.

And perhaps one function for converting schema elements to some json
representation, so that a json_dump could easily be constructed :)

We really do not need footguns similar to database_to_xml() or
schema_to_xml(), which just consume all memory in the server on any
real database.

 I don't really agree with that or any of the naming styles that are in
 the form inputtype_func() but I think most people are on the other
 side of the argument.

I think that most people have not given this any thought yet, so they
simply lack any reasoned opinion ;)

 merlin

-- 
---
Hannu Krosing
PostgreSQL Unlimited Scalability and Performance Consultant
2ndQuadrant Nordic
PG Admin Book: http://www.2ndQuadrant.com/books/




Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-01 Thread Joey Adams
On Tue, May 1, 2012 at 8:02 AM, Hannu Krosing ha...@2ndquadrant.com wrote:
 Hi hackers

 After playing around with array_to_json() and row_to_json() functions a
 bit, I have a question - why do we even have 2 variants *_to_json()

Here's the discussion where that decision was made:

http://archives.postgresql.org/pgsql-hackers/2012-01/msg01339.php

To quote:

 why not call all these functions 'to_json' and overload them?

 I don't honestly feel that advances clarity much. And we might want to 
 overload each at some stage with options that are specific to the datum 
 type. We have various foo_to_xml() functions now.

 -1

 older proposal is more consistent with xml functions

The most compelling argument I see here is the one about options
specific to the datum type.

Two other reasons I can think of:

 * If someone tries to google for how to convert an array to JSON,
having a function named 'array_to_json' will make that easier.

 * If the JSON type does not yet support, say, converting from a
number, it will be apparent from the names and types of the functions,
rather than being a hidden surprise.  On the other hand, array_to_json
and composite_to_json already convert ANY values to JSON, so this
doesn't matter, anyway.

On Tue, May 1, 2012 at 11:02 AM, Hannu Krosing ha...@2ndquadrant.com wrote:
 What we currently lack is direct conversion for simple types, though
 they are easily achieved by converting to a single-element array and
 then stripping outer [] from the result

I agree that this function ought to be exposed.  Note that such a
function (perhaps called datum_to_json) is indeed the same as the
proposed to_json function, which tries to convert a value of any type
to JSON.

 It would be really nice to also have the casts from json to any type,
 including records though.

What the casts currently do (primarily) is convert between the TEXT
and JSON types.  So if you have JSON-encoded TEXT, use a cast to
convert it to the JSON type (this will perform validation, ensuring
that no invalid JSON gets in).  Any escape/unescape operations need to
be explicit.

-Joey



Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-01 Thread Merlin Moncure
On Tue, May 1, 2012 at 10:49 AM, Joey Adams joeyadams3.14...@gmail.com wrote:
 On Tue, May 1, 2012 at 8:02 AM, Hannu Krosing ha...@2ndquadrant.com wrote:
 Hi hackers

 After playing around with array_to_json() and row_to_json() functions a
 bit, I have a question - why do we even have 2 variants *_to_json()

 Here's the discussion where that decision was made:

 http://archives.postgresql.org/pgsql-hackers/2012-01/msg01339.php

 To quote:

 why not call all these functions 'to_json' and overload them?

 I don't honestly feel that advances clarity much. And we might want to 
 overload each at some stage with options that are specific to the datum 
 type. We have various foo_to_xml() functions now.

 -1

 older proposal is more consistent with xml functions

 The most compelling argument I see here is the one about options
 specific to the datum type.

I don't find that to be particularly compelling at all.  to_timestamp
for example supports multiple argument versions depending on the input
type.

  * If the JSON type does not yet support, say, converting from a
 number, it will be apparent from the names and types of the functions,
 rather than being a hidden surprise.  On the other hand, array_to_json
 and composite_to_json already convert ANY values to JSON, so this
 doesn't matter, anyway.

I don't see how not having to_json(type) is any less surprising than
type_to_json().

To add:
Are we going to have json_length()?  Why shouldn't length operate
directly on the json type since it has a length?  Or are we going to
force an implicit cast to text?

An elementary point of generic programming through SQL is that you are
supposed to keep *what you are trying to do* decoupled from *what
you're doing it on*.  It allows for very natural and terse
programming.  The array, xml, and now the json apis essentially
violate this principle.  The array api I find particularly galling
since you end up having to retype 'array' N times in a single
expression.

merlin



Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-01 Thread Hannu Krosing
On Tue, 2012-05-01 at 11:49 -0400, Joey Adams wrote:
 On Tue, May 1, 2012 at 8:02 AM, Hannu Krosing ha...@2ndquadrant.com wrote:
  Hi hackers
 
  After playing around with array_to_json() and row_to_json() functions a
  bit, I have a question - why do we even have 2 variants *_to_json()
 
 Here's the discussion where that decision was made:
 
 http://archives.postgresql.org/pgsql-hackers/2012-01/msg01339.php

Thanks, will read it!

 To quote:
 
  why not call all these functions 'to_json' and overload them?
 
  I don't honestly feel that advances clarity much. And we might want 
 to overload each at some stage with options that are specific to the 
 datum type. We have various foo_to_xml() functions now.
 
  -1
 
  older proposal is more consistent with xml functions
 
 The most compelling argument I see here is the one about options
 specific to the datum type.
 
 Two other reasons I can think of:
 
  * If someone tries to google for how to convert an array to JSON,
 having a function named 'array_to_json' will make that easier.

Well, if you want to know how to convert an integer to string, you don't
use an integer_to_text() function; you just use a working cast.

and here it is an outright lie:

hannu=# select 1::json;
ERROR:  cannot cast type integer to json
LINE 1: select 1::json;

the error should be "won't cast type integer to json" :)

It very well _can_ convert it, as it does so without a problem when such
an integer is inside an array or a record type.

  * If the JSON type does not yet support, say, converting from a
 number, it will be apparent from the names and types of the functions,
 rather than being a hidden surprise.  On the other hand, array_to_json
 and composite_to_json already convert ANY values to JSON, so this
 doesn't matter, anyway.

By this logic all non-working casts are hidden surprises

 On Tue, May 1, 2012 at 11:02 AM, Hannu Krosing ha...@2ndquadrant.com wrote:
  What we currently lack is direct conversion for simple types, though
  they are easily achieved by converting to a single-element array and
  then stripping outer [] from the result
 
 I agree that this function ought to be exposed.  Note that such a
 function (perhaps called datum_to_json) is indeed the same as the
 proposed to_json function, which tries to convert a value of any type
 to JSON.

Hmm, I have just found an answer to my question on how to add to_json()
capability to extension types.

If we had to_json as a cast, it would probably be straightforward for
extensions like hstore to provide their own to_json casts - especially
now that the json type is in core - and we could get a working hstore
-> json conversion by just running the 

CREATE EXTENSION hstore;

command.

  It would be really nice to also have the casts from json to any type,
  including records though.
 
 What the casts currently do (primarily) is convert between the TEXT
 and JSON types.  So if you have JSON-encoded TEXT, use a cast to
 convert it to the JSON type (this will perform validation, ensuring
 that no invalid JSON gets in).  Any escape/unescape operations need to
 be explicit.
 
 -Joey

-- 
---
Hannu Krosing
PostgreSQL Unlimited Scalability and Performance Consultant
2ndQuadrant Nordic
PG Admin Book: http://www.2ndQuadrant.com/books/




Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-01 Thread Andrew Dunstan
On Tue, May 1, 2012 at 9:05 AM, Merlin Moncure mmonc...@gmail.com wrote:

 On Tue, May 1, 2012 at 10:49 AM, Joey Adams joeyadams3.14...@gmail.com
 wrote:
  On Tue, May 1, 2012 at 8:02 AM, Hannu Krosing ha...@2ndquadrant.com
 wrote:
  Hi hackers
 
  After playing around with array_to_json() and row_to_json() functions a
  bit, I have a question - why do we even have 2 variants *_to_json()
 
  Here's the discussion where that decision was made:
 
  http://archives.postgresql.org/pgsql-hackers/2012-01/msg01339.php
 
  To quote:
 
  why not call all these functions 'to_json' and overload them?
 
  I don't honestly feel that advances clarity much. And we might want to
 overload each at some stage with options that are specific to the datum
 type. We have various foo_to_xml() functions now.
 
  -1
 
  older proposal is more consistent with xml functions
 
  The most compelling argument I see here is the one about options
  specific to the datum type.

 I don't find that to be particularly compelling at all.  to_timestamp
 for example supports multiple argument versions depending on the input
 type.

   * If the JSON type does not yet support, say, converting from a
  number, it will be apparent from the names and types of the functions,
  rather than being a hidden surprise.  On the other hand, array_to_json
  and composite_to_json already convert ANY values to JSON, so this
  doesn't matter, anyway.



I am away from base on a consulting assignment all this week, so my
connectivity and time are severely limited, and I don't have time to
respond in depth.

Let me just point out two things. First, we are approaching a beta release.
The time for changing this is long since gone, IMNSHO.

Second, RFC 4627 is absolutely clear: a valid JSON value can only be an
object or an array, so this thing about converting arbitrary datum values
to JSON is a fantasy. If anything, we should adjust the JSON input routines
to disallow anything else, rather than start to output what is not valid
JSON.


cheers

andrew


Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-01 Thread Merlin Moncure
On Tue, May 1, 2012 at 11:22 AM, Andrew Dunstan and...@dunslane.net wrote:


 On Tue, May 1, 2012 at 9:05 AM, Merlin Moncure mmonc...@gmail.com wrote:

 On Tue, May 1, 2012 at 10:49 AM, Joey Adams joeyadams3.14...@gmail.com
 wrote:
  On Tue, May 1, 2012 at 8:02 AM, Hannu Krosing ha...@2ndquadrant.com
  wrote:
  Hi hackers
 
  After playing around with array_to_json() and row_to_json() functions a
  bit, I have a question - why do we even have 2 variants *_to_json()
 
  Here's the discussion where that decision was made:
 
  http://archives.postgresql.org/pgsql-hackers/2012-01/msg01339.php
 
  To quote:
 
  why not call all these functions 'to_json' and overload them?
 
  I don't honestly feel that advances clarity much. And we might want to
  overload each at some stage with options that are specific to the datum
  type. We have various foo_to_xml() functions now.
 
  -1
 
  older proposal is more consistent with xml functions
 
  The most compelling argument I see here is the one about options
  specific to the datum type.

 I don't find that to be particularly compelling at all.  to_timestamp
 for example supports multiple argument versions depending on the input
 type.

   * If the JSON type does not yet support, say, converting from a
  number, it will be apparent from the names and types of the functions,
  rather than being a hidden surprise.  On the other hand, array_to_json
  and composite_to_json already convert ANY values to JSON, so this
  doesn't matter, anyway.



 I am away from base on a consulting assignment all this week, so my
 connectivity and time are severely limited, and I don't have time to respond
 in depth.

 Let me just point out two things. First, we are approaching a beta release.
 The time for changing this is long since gone, IMNSHO.

sure. pedantic philosophical arguments aside, I'm already using the
api heavily and would prefer not to see it changed :-).

merlin



Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-01 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 Let me just point out two things. First, we are approaching a beta release.
 The time for changing this is long since gone, IMNSHO.

This is our last chance to get it right, so that argument doesn't seem
to me to carry a lot of weight ...

 Second, RFC 4627 is absolutely clear: a valid JSON value can only be an
 object or an array, so this thing about converting arbitrary datum values
 to JSON is a fantasy. If anything, we should adjust the JSON input routines
 to disallow anything else, rather than start to output what is not valid
 JSON.

... but this one does.

regards, tom lane



Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-01 Thread Joey Adams
On Tue, May 1, 2012 at 12:22 PM, Andrew Dunstan and...@dunslane.net wrote:
 Second, RFC 4627 is absolutely clear: a valid JSON value can only be an
 object or an array, so this thing about converting arbitrary datum values to
 JSON is a fantasy. If anything, we should adjust the JSON input routines to
 disallow anything else, rather than start to output what is not valid JSON.

No, the RFC says (emphasis mine):

A JSON *text* is a serialized object or array.

If we let the JSON type correspond to a *value* instead, this
restriction does not apply, and the JSON type has a useful recursive
definition.

For example, this function would not be possible if we applied the
object or array restriction:

unnest(json) returns setof json

Note that a similar distinction appears with the XML type: document
versus content.

-Joey



Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-01 Thread Andrew Dunstan
On Tue, May 1, 2012 at 9:56 AM, Joey Adams joeyadams3.14...@gmail.comwrote:

 On Tue, May 1, 2012 at 12:22 PM, Andrew Dunstan and...@dunslane.net
 wrote:
  Second, RFC 4627 is absolutely clear: a valid JSON value can only be an
  object or an array, so this thing about converting arbitrary datum
 values to
  JSON is a fantasy. If anything, we should adjust the JSON input routines
 to
  disallow anything else, rather than start to output what is not valid
 JSON.

 No, the RFC says (emphasis mine):

A JSON *text* is a serialized object or array.

 If we let the JSON type correspond to a *value* instead, this
 restriction does not apply, and the JSON type has a useful recursive
 definition.

 For example, this function would not be possible if we applied the
 object or array restriction:

unnest(json) returns setof json

 Note that a similar distinction appears with the XML type: document
 versus content.



I think you're playing with words. But in any case, the RFC says this
regarding generators:

5. Generators

   A JSON generator produces JSON text.  The resulting text MUST
   strictly conform to the JSON grammar.

Our functions do seem to be JSON generators. So even if we accept
things that aren't JSON texts in our parser (possibly permitted by
section 4 of the RFC) we should not be generating them.


cheers

andrew


Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-01 Thread Hannu Krosing
On Tue, 2012-05-01 at 12:54 -0400, Tom Lane wrote:
 Andrew Dunstan and...@dunslane.net writes:
  Let me just point out two things. First, we are approaching a beta release.
  The time for changing this is long since gone, IMNSHO.
 
 This is our last chance to get it right, so that argument doesn't seem
 to me to carry a lot of weight ...
 
  Second, RFC 4627 is absolutely clear: a valid JSON value can only be an
  object or an array, 

No, according to RFC, a valid JSON value can be:

  an object, an array, a number, a string, or one of false null true

From RFC:
-
1.  Introduction

   JavaScript Object Notation (JSON) is a text format for the
   serialization of structured data.  It is derived from the object
   literals of JavaScript, as defined in the ECMAScript Programming
   Language Standard, Third Edition [ECMA].

   JSON can represent four primitive types (strings, numbers, booleans,
   and null) and two structured types (objects and arrays).
...

2.1.  Values

   A JSON value MUST be an object, array, number, or string, or one of
   the following three literal names:

  false null true

-

By having our JSON type mean a JSON value instead of JSON
text (which indeed is required to be array or object) we could make it
easy for all extension types to provide casts to JSON value and thus
automatically plug them into postgreSQL's built-in JSON support.

I would very much like the *_to_json() functions to first try a cast
to json when converting values, so that, for example, after the following
cast they would do the right thing for hstore.

CREATE FUNCTION hstore_to_json(IN hvalue hstore, OUT jvalue json) AS $$
return '{%s}' % hvalue.replace('=>',':')
$$ LANGUAGE plpythonu;

CREATE CAST (hstore AS json)
WITH FUNCTION hstore_to_json(hstore)
AS IMPLICIT
;

hannu=# select *, datadict::json from test_hstore;
 id |            datadict             |            datadict             
----+---------------------------------+---------------------------------
  1 | "baz"=>"whatever", "foo"=>"bar" | {"baz":"whatever", "foo":"bar"}
  2 | "bar"=>"the same", "foo"=>"bar" | {"bar":"the same", "foo":"bar"}
(2 rows)

Currently it seems to be hardwired to do datum -> text conversions

hannu=# select row_to_json(test_hstore) from test_hstore;
                          row_to_json                           
----------------------------------------------------------------
 {"id":1,"datadict":"\"baz\"=>\"whatever\", \"foo\"=>\"bar\""}
 {"id":2,"datadict":"\"bar\"=>\"the same\", \"foo\"=>\"bar\""}
(2 rows)

I'd like it to try datum -> json first and yield

hannu=# select row_to_json(test_hstore) from test_hstore;
                     row_to_json                      
------------------------------------------------------
 {"id":1,"datadict":{"baz":"whatever", "foo":"bar"}}
 {"id":2,"datadict":{"bar":"the same", "foo":"bar"}}
(2 rows)

This exact case could be made to work even with our json type meaning JSON
text, but some other types may not be so lucky. 

For example, imagine a tri-valued boolean with textual values 'yes', 'no',
and 'don't know'. The logical mapping to JSON would be true, false, null,
but we can't easily provide a triboolean -> json cast for this if we
require json values to be JSON text and don't accept plain JSON values.
  so this thing about converting arbitrary datum values
  to JSON is a fantasy. 

It should be possible to cast them to a JSON value, but not always to JSON
text, which indeed has to be an array or object.

  If anything, we should adjust the JSON input routines
  to disallow anything else, rather than start to output what is not valid
  JSON.

Nah, I'd like us to accept what other JSON parsers usually accept,
especially the original one described in
http://www.json.org/fatfree.html which cited one way to parse json to be

responseData = eval('(' + responseText + ')');

:)
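In the same lenient spirit (a safer sketch than eval, using Python's stdlib json module rather than anything PostgreSQL-specific):

```python
import json

# Bare JSON values round-trip fine; no enclosing object/array required
assert json.loads('1') == 1
assert json.loads('"a"') == "a"
assert json.loads('null') is None

# and the generator side emits them just as readily
assert json.dumps(1) == '1'
assert json.dumps("a") == '"a"'
```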

But then I also like their statement when comparing JSON to XML :

JSON has no validator. Being well-formed and valid is not the same as
being correct and relevant. Ultimately, every application is responsible
for validating its inputs. This cannot be delegated. 

 ... but this one does.

It does, _if_ we accept that the json type is for JSON text and not JSON
value, in which case we might also need a json_value type for
extensible casting to and from json.

   regards, tom lane

-- 
---
Hannu Krosing
PostgreSQL Unlimited Scalability and Performance Consultant
2ndQuadrant Nordic
PG Admin Book: http://www.2ndQuadrant.com/books/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-01 Thread Hannu Krosing
On Tue, 2012-05-01 at 11:11 -0700, Andrew Dunstan wrote:
 On Tue, May 1, 2012 at 9:56 AM, Joey Adams
 joeyadams3.14...@gmail.com wrote: 
...
 No, the RFC says (emphasis mine):
 
A JSON *text* is a serialized object or array.
 
 If we let the JSON type correspond to a *value* instead, this
 restriction does not apply, and the JSON type has a useful
 recursive
 definition.
 
 For example, this function would not be possible if we applied
 the
 object or array restriction:
 
unnest(json) returns setof json
 
 Note that a similar distinction appears with the XML type:
 document
 versus content.

 
 I think you're playing with words. But in any case, the RFC says this
 regarding generators:
 
 5. Generators
 
A JSON generator produces JSON text.  The resulting text MUST
strictly conform to the JSON grammar.

I know it is a weak argument, but at least Python, Ruby and JavaScript 
in both Firefox and Chrome do generate JSON values, so our users might
kind of expect us to do the same:

Python
-
hannu@hvost:~$ python
Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56) 
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import json
>>> json.dumps(1)
'1'

Ruby:
---
hannu@hvost:~$ irb
irb(main):001:0> require 'json'
=> true
irb(main):002:0> JSON.generate(1)
=> "1"

Mozilla Javascript
-- 
> JSON.stringify(1)
"1"

Chrome Javascript
--
> JSON.stringify(1)
"1"

 Our functions do seem to be JSON generators. So even if we accept 
 things that aren't JSON texts in our parser (possibly permitted 
 by section 4 of the RFC) we should not be generating them.

we could have a function 

json_generator(json) returns text 

for generating pure JSON texts ;)


The argument for interpreting our json type as a JSON value instead of
JSON text is the much greater flexibility in extension-type
support for json (and not only extension types, but also the types
explicitly unsupported by JSON, like Date) via defining casts to and
from json.

From http://en.wikipedia.org/wiki/JSON

-
Unsupported native data types
-
JavaScript syntax defines several native data types not included in the
JSON standard:[7] Date, Error, Math, Regular Expression, and Function.
These JavaScript data types must be represented as some other data
format, with the programs on both ends agreeing on how to convert
between types. As of 2011, there are some de facto standards for e.g.
converting between Date and String, but none universally
recognized.[8][9] Other languages may have a different set of native
types that must be serialized carefully to deal with this type of
conversion.

-

If we allowed user-defined casts for things like Date, then our users
could decide which de facto standard to support in each specific case.

But if we did interpret our json type strictly as JSON text, not JSON
value, this would need another type for json_value.

And it is entirely possible that somebody does want to do what Merlin
described recently, that is, get a rowset of json values from the
client and wrap them in '[' and ']' on the way out; it would be a shame to
restrict his json array elements to just objects and arrays and not
the other legal json values.
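A rough sketch of that wrapping step (Python stands in for whatever the server-side code would do; the row values are hypothetical):

```python
import json

# One JSON value per row, as they might arrive from the client;
# note the scalar and boolean elements that a strict "objects and
# arrays only" rule would reject.
rows = ['1', '"abc"', 'true', '{"k": "v"}']

# Wrapping in '[' and ']' on the way out yields a valid JSON text
array_text = '[' + ','.join(rows) + ']'
assert json.loads(array_text) == [1, "abc", True, {"k": "v"}]
```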




Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-01 Thread Hannu Krosing
On Tue, 2012-05-01 at 09:22 -0700, Andrew Dunstan wrote:
 
 
 On Tue, May 1, 2012 at 9:05 AM, Merlin Moncure mmonc...@gmail.com
 wrote:
 On Tue, May 1, 2012 at 10:49 AM, Joey Adams
 joeyadams3.14...@gmail.com wrote:
  On Tue, May 1, 2012 at 8:02 AM, Hannu Krosing
 ha...@2ndquadrant.com wrote:
  Hi hackers
 
  After playing around with array_to_json() and row_to_json()
 functions a
  bit it I have a question - why do we even have 2 variants
 *_to_json()
 
  Here's the discussion where that decision was made:
 
 
 http://archives.postgresql.org/pgsql-hackers/2012-01/msg01339.php
 
  To quote:
 
  why not call all these functions 'to_json' and overload
 them?
 
  I don't honestly feel that advances clarity much. And we
 might want to overload each at some stage with options that
 are specific to the datum type. We have various foo_to_xml()
 functions now.
 
  -1
 
  older proposal is more consistent with xml functions
 
  The most compelling argument I see here is the one about
 options
  specific to the datum type.
 
 
 I don't find that to be particularly compelling at all.
  to_timestamp
 for example supports multiple argument versions depending on
 the input
 type.
 
   * If the JSON type does not yet support, say, converting
 from a
  number, it will be apparent from the names and types of the
 functions,
  rather than being a hidden surprise.  On the other hand,
 array_to_json
  and composite_to_json already convert ANY values to JSON, so
 this
  doesn't matter, anyway.
 
 
 
 I am away from base on a consulting assignment all this week, so my
 connectivity and time are severely limited, and I don't have time to
 respond in depth.
 
 Let me just point out two things. First, we are approaching a beta
 release. The time for changing this is long since gone, IMNSHO.

First, let me start with stating that I am really happy (and a little
amazed and envious ;) ) with what the current to_json functions are
capable of.

It is already way better than what current query_to_xml could do (at
least in 9.1, may have improved since).


hannu=# select row_to_json(z) from (select 1::int as a, (select s from
(select 2::int as x, 2::text as b)s))z;
         row_to_json          
------------------------------
 {"a":1,"s":{"x":2,"b":"2"}}
(1 row)

hannu=# select query_to_xml('select 1::int as a, (select s from (select
2::int as x, 2::text as b)s)',true,true,'');
                        query_to_xml                         
-------------------------------------------------------------
 <row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">+
                                                            +
   <a>1</a>                                                 +
   <_x003F_column_x003F_>(2,2)</_x003F_column_x003F_>       +
 </row>                                                     +
                                                            +
(1 row)


The reason I am whining now is that with minor adjustments in
implementation it could all be made much more powerful (try cast
to ::json for values before cast to ::text) and much more elegant thanks
to PostgreSQL's built in casting.

If we allowed json to hold any JSON value and tried ::json when
generating json for compound types, then we would be able to claim that
PostgreSQL supports JSON everywhere, defaulting to representing
officially unsupported types as strings, but allowing users to convert
these to their preferred conventions.

I'd also prefer to have default conversions already included for some of
our sexier types, like intervals (just a two element array) and hstore
(an object) etc.
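For comparison, Python's generator exposes exactly this kind of per-type hook; the interval-as-two-element-array mapping below is only an illustration of the idea, not a proposed standard:

```python
import json
import datetime

# The "default" hook plays the role the proposed ::json casts would:
# the application decides how an otherwise-unsupported type serializes.
def to_json_value(obj):
    if isinstance(obj, datetime.timedelta):
        return [obj.days, obj.seconds]          # illustrative convention
    raise TypeError("not JSON serializable: %r" % (obj,))

out = json.dumps({"waited": datetime.timedelta(days=1, seconds=90)},
                 default=to_json_value)
assert out == '{"waited": [1, 90]}'
```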

Suddenly we would be the best-match database for Web development and all
things Ajax, and also have a widely used, built-in and adjustable
interchange format to the outer world.

 Second, RFC 4627 is absolutely clear: a valid JSON value can only be
 an object or an array, so this thing about converting arbitrary datum
 values to JSON is a fantasy. 

Probably a typo on your part - valid JSON _text_ is an object or array; a
valid JSON value can also be a number, string, true, false or null.

What I am arguing for is interpreting our json type as representing a
JSON value, not JSON text; this would enable users to adjust and
extend the generation of json values by defining casts for their
specific types - most notably the Date* types, but also things like hstore,
which has a natural JSON representation as an object (a list of key:value
pairs for non-js users, a.k.a. a dictionary, hash, etc.)

 If anything, we should adjust the JSON input routines to disallow
 anything else, rather than start to output what is not valid JSON.

Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-01 Thread David Johnston
 -Original Message-
 From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-
 ow...@postgresql.org] On Behalf Of Hannu Krosing
 Sent: Tuesday, May 01, 2012 5:29 PM
 
 The reason I am whining now is that with minor adjustments in
 implementation it could all be made much more powerful (try cast to ::json
 for values before cast to ::text) and much more elegant thanks to
 PostgreSQL's built in casting.
 
 If we allowed json to hold any JSON value and tried ::json when generating
 json for compound types than we would be able to claim that PostgreSQL
 supports JSON everywhere, defaulting to representing officially unsupported
 types as strings, but allowing users to convert these to their preferred
 conventions.

I get that a JSON Text is always also a JSON Value but the reverse is not true. 
 Thus, if we define JSON to be JSON Value we cannot guarantee that the encoded 
value is a possible JSON Text - the most important property for purposes of 
data interchange.

 
 I'd also prefer to have default conversions already included for some of our
 sexier types, like intervals (just a two element array) and hstore (an object)
 etc.

Interval is not simply 2 values but also denotes whether the particular value 
is inclusive or exclusive; you would have to use an object unless you transmit 
in a text format and let the target perform the necessary interpretation of the 
string.

 
 Suddenly we would be the best match database for Web development and
 all things Ajax and also have a widely used built in and adjustable 
 interchange
 format to outer world.
 
  Second, RFC 4627 is absolutely clear: a valid JSON value can only be
  an object or an array, so this thing about converting arbitrary datum
  values to JSON is a fantasy.
 
 Probably a typo on your part - valid JSON _text_ is object or array, valid
 JSON value can also be number, text, true, false and null
 
 What I am arguing for is interpreting our json type as representing a JSON
 value not JSON text, this would enable users to adjust and extend the
 generation of json values via defining casts for their specific types - most
 notably Date* types but also things like hstore, which has a natural JSON
 representation as object (a list of key:value pairs for non-js users, 
 a.k.a. a
 dictionary, hash, etc.)

Aside from the fact it is likely too late to change the interpretation I would 
argue against doing so in any case.

Currently, the idea is to get your result all lined up and ready to go and then 
ship it off to the caller as valid JSON so that the caller does not have to do 
so itself.  Answering the question "what would this value look like if it was 
part of a json output?" is good; however, production use is likely to mostly 
care about the entire json interchange construct (i.e., JSON Text).

So: json -> json_text; 

A JSON Value always has a textual representation but if we were to have an 
actual type it would make sense to encode it such that (strings, objects and 
arrays) are delimited while (numbers, false, true, and null) are not.

Type Name: json_value

Output Representations (all output surrounded by double-quotes since all are 
string-like) - 
String: 'VALUE' (single-quote delimiter)
Object: {...}
Array: []
Number: 0.00
Other: false, true, null

JSON is fundamentally an interchange format (especially from a database's 
perspective).  JSON Values only really have meaning if they are attached 
explicitly to a JSON Text structure, if you wanted to store one independently 
you should convert it into a native representation first.  The few exceptions 
to this would be sufficiently handled via plain text with meta-data indicating 
that the stored value is structured in directly JSON compatible syntax.  In 
short, the default context for JSON in PostgreSQL should be JSON Text (not JSON 
Value) and thus the unadorned json should reflect this default (which it 
does).

 
  If anything, we should adjust the JSON input routines to disallow
  anything else, rather than start to output what is not valid JSON.
 
 I tested python, ruby and javascript in firefox and chrome; all their JSON
 generators generate 1 for a standalone integer 1 and "a" for a standalone
 string 'a', and none refused to convert either to JSON.
 

Assume that we keep the meaning of json to be JSON Text; what would you 
suggest occurs if someone attempts a datum -> json cast?  Given that we are 
working in a strongly-typed environment the meaning of JSON cannot be changed, 
and so either the cast has to output valid JSON Text or it has to fail.  My 
personal take is to have it fail, since any arbitrary decision to cast to JSON 
Text is going to make someone unhappy, and supposedly they can modify their 
query so that the result generates whatever format they desire.


I haven't followed the JSON development in 9.2 too closely but exposing 
whatever conversion mechanism is currently used to generate JSON makes sense 
from a ease-of-development standpoint.  But even then, during 

Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-01 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On Tue, May 1, 2012 at 9:56 AM, Joey Adams joeyadams3.14...@gmail.comwrote:
 No, the RFC says (emphasis mine):
 
 A JSON *text* is a serialized object or array.
 
 If we let the JSON type correspond to a *value* instead, this
 restriction does not apply, and the JSON type has a useful recursive
 definition.

 I think you're playing with words. But in any case, the RFC says this
 regarding generators:
 5. Generators
A JSON generator produces JSON text.  The resulting text MUST
strictly conform to the JSON grammar.

I read over the RFC, and I think the only reason why they restricted
JSON texts to represent just a subset of JSON values is this cute
little hack in section 3 (Encoding):

   Since the first two characters of a JSON text will always be ASCII
   characters [RFC0020], it is possible to determine whether an octet
   stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
   at the pattern of nulls in the first four octets.
   00 00 00 xx  UTF-32BE
   00 xx 00 xx  UTF-16BE
   xx 00 00 00  UTF-32LE
   xx 00 xx 00  UTF-16LE
   xx xx xx xx  UTF-8

They need a guaranteed 2 ASCII characters to make that work, and
they won't necessarily get that many with a bare string literal.
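The detection itself is trivial to sketch (my own illustration, not code from any patch); note how it relies on those first two guaranteed-ASCII characters, which a bare string literal cannot promise:

```python
# RFC 4627 section 3 heuristic: look at the null-byte pattern of the
# first four octets of a JSON text to guess its Unicode encoding.
def detect_json_encoding(octets):
    b = octets[:4]
    if len(b) < 4:
        return "utf-8"                      # heuristic needs four octets
    nulls = [byte == 0 for byte in b]
    if nulls == [True, True, True, False]:
        return "utf-32-be"
    if nulls == [True, False, True, False]:
        return "utf-16-be"
    if nulls == [False, True, True, True]:
        return "utf-32-le"
    if nulls == [False, True, False, True]:
        return "utf-16-le"
    return "utf-8"

assert detect_json_encoding('{"a":1}'.encode("utf-16-le")) == "utf-16-le"
assert detect_json_encoding('{"a":1}'.encode("utf-8")) == "utf-8"
```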

Since for our purposes there is not, and never will be, any need to
figure out whether a JSON input string is encoded in UTF16 or UTF32,
I find myself agreeing with the camp that says we might as well consider
that our JSON type corresponds to JSON values not JSON texts.  I also
notice that json_in() seems to believe that already.

However, that doesn't mean I'm sold on the idea of getting rid of
array_to_json and row_to_json in favor of a universal to_json()
function.  In particular, both of those have optional pretty_bool
arguments that don't fit nicely at all in a generic conversion
function.  The meaning of that flag is very closely tied to the
input being an array or record respectively.

I'm inclined to leave these functions as they are, and consider
adding a universal to_json(anyelement) (with no options) later.
Because it would not have options, it would not be meant to cover
cases where there's value in formatting or conversion options;
so it wouldn't render the existing functions entirely obsolete,
nor would it mean there would be no need for other specialized
conversion functions.

regards, tom lane



Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-01 Thread Hannu Krosing
On Tue, 2012-05-01 at 18:35 -0400, David Johnston wrote:
  -Original Message-
  From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-
  ow...@postgresql.org] On Behalf Of Hannu Krosing
  Sent: Tuesday, May 01, 2012 5:29 PM
  
  The reason I am whining now is that with minor adjustments in
  implementation it could all be made much more powerful (try cast to ::json
  for values before cast to ::text) and much more elegant thanks to
  PostgreSQL's built in casting.
  
  If we allowed json to hold any JSON value and tried ::json when generating
  json for compound types than we would be able to claim that PostgreSQL
  supports JSON everywhere, defaulting to representing officially unsupported
  types as strings, but allowing users to convert these to their preferred
  conventions.
 
 I get that a JSON Text is always also a JSON Value but the reverse is not 
 true. 
 Thus, if we define JSON to be JSON Value we cannot guarantee that the encoded 
 value is a possible JSON Text - the most important property for purposes of
 data interchange.

Nope, the most important property for purposes of data interchange is
that we produce something that the client expects and can understand
without too much extra work on client side.

The way to guarantee JSON Text is to encode objects that produce it.

I see nothing wrong with returning either a complex JSON object or
simply null if the object could not be found.  

  I'd also prefer to have default conversions already included for some of our
  sexier types, like intervals (just a two element array) and hstore (an 
  object)
  etc.
 
 Interval is not simply 2 values but also denotes whether the particular value 
 is inclusive or exclusive; you would have to use an object unless you transmit
 in a text format and let the target perform the necessary interpretation of 
 the string.

if you need that info to be passed to _your_ caller, you just define a
new cast for your interval-to-json which returns object notation. 

It was meant as a sample of what could be included by having generic
json values and using json casts.
 
  Suddenly we would be the best match database for Web development and
  all things Ajax and also have a widely used built in and adjustable 
  interchange
  format to outer world.
  
   Second, RFC 4627 is absolutely clear: a valid JSON value can only be
   an object or an array, so this thing about converting arbitrary datum
   values to JSON is a fantasy.
  
  Probably a typo on your part - valid JSON _text_ is object or array, valid
  JSON value can also be number, text, true, false and null
  
  What I am arguing for is interpreting our json type as representing a JSON
  value not JSON text, this would enable users to adjust and extend the
  generation of json values via defining casts for their specific types - most
  notably Date* types but also things like hstore, which has a natural JSON
  representation as object (a list of key:value pairs for non-js users, 
  a.k.a. a
  dictionary, hash, etc.)
 
 Aside from the fact it is likely too late to change the interpretation I 
 would argue against doing so in any case.
 
 Currently, the idea is to get your result all lined up and ready to go and 
 then ship it off to the caller as valid JSON so that the caller does not 
 have to do so itself.  

Valid JSON is only a small part of the equation, as I quoted before
from JSON: The Fat-Free Alternative to XML at
http://www.json.org/fatfree.html

JSON has no validator. Being well-formed and valid is not the same as
being correct and relevant. Ultimately, every application is responsible
for validating its inputs.

If we produce correct JSON text for things that are convertible to JSON
text, then we should be free to produce JSON values for simple values,
like everybody else (that is, JavaScript, Python, Ruby, ...).

I don't think it is PostgreSQL's business to start educating people
about the correct way to do JSON serialisation when everybody else does it
the generic way.


 Answering the question what would this value look like if it was part of 
 a json output? is good; however, production use is likely to mostly care 
 about the entire json interchange construct (i.e., JSON Text)

The "what would it look like" part is important for values that are not
covered by the standard and are thus encoded as text. These need to follow
conventions outside the JSON spec proper, and thus may need to be
adjusted by the developer.

Doing it via ::json casts would be the cleanest and simplest way to deal
with it.

 So: json -> json_text; 
 
 A JSON Value always has a textual representation but if we were to have an
  actual type it would make sense to encode it such that (strings, objects 
 and arrays) are delimited while (numbers, false, true, and null) are not.

And so it is, what are you trying to say here ?

 Type Name: json_value
 
 Output Representations (all output surrounded by double-quotes since all are 
 string-like) - 
 String: 'VALUE' (single-quote 

Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

2012-05-01 Thread David Johnston
On May 1, 2012, at 20:41, Hannu Krosing ha...@2ndquadrant.com wrote:
 
 Most people don't work in strongly-typed environment, and thus would
 work around such restriction if they need a simple JSON value at the
 other end of the interchange.
 
 
 My personal take it is have it fail since any arbitrary decision to cast 
 to JSON Text is going to make someone unhappy and supposedly they can 
 modify their query so that the result generates whatever format they desire.
 
 Do you actually have such an experience or is it just a wild guess ?
 
 

So even given the semantic differences between an object and a scalar, I am 
better understanding where interpreting JSON as JSON Value makes sense.  
However, if I convert a record or array to JSON I expect to get a JSON Text 
even if there is only a single column or value in the input.  

I guess my take is that record -> JSON Text while anything else is JSON Value.  
Whether maintaining the special case for record is worthwhile I really 
do not know, but the semantic difference does exist; and record output is a 
significant aspect of PostgreSQL output.

I get the ease-of-use aspect but also recognize that sometimes being slightly 
harder to use is worthwhile if you eliminate ambiguities or limit the 
possibility to make mistakes.

FWIW my background on this topic is more theoretical than experiential, though I 
am a web-application developer by trade and do use some JSON in that capacity.

David J.