Re: [HACKERS] JSON for PG 9.2
On Tue, 2012-01-31 at 12:58 -0500, Andrew Dunstan wrote: On 01/30/2012 10:37 AM, Andrew Dunstan wrote: Aside: is query_to_json really necessary? It seems rather ugly and easily avoidable using row_to_json. I started with this, again by analogy with query_to_xml(). But I agree it's a bit ugly. If we're not going to do it, then we definitely need to look at caching the output funcs in the function info. A closer approximation is actually: SELECT array_to_json(array_agg(q)) FROM ( your query here ) q; But then I'd want the ability to break that up a bit with line feeds, so we'd need to adjust the interface slightly. (Hint: don't try the above with select * from pg_class.) I'll wait on further comments, but I can probably turn these changes around very quickly once we're agreed. based on Abhijit's feeling and some discussion offline, the consensus seems to be to remove query_to_json. The only comment I have here is that query_to_json could have been replaced with json_agg, so that you don't need to do double-buffering for the results of the array(yourquery) call in SELECT array_to_json(array(yourquery)); Or is there some other way to avoid it except to wrap row_to_json() calls in its own aggregate function which adds enclosing brackets and a comma separator (like this: '[' row1 [, rowN] ']')? cheers andrew
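For concreteness, the wrapper aggregate Hannu is alluding to can be built from the functions already committed. A minimal sketch (illustrative names only; the built-in json_agg did not exist yet, and this does no pretty-printing):

    create or replace function json_agg_append(state text, rec anyelement)
    returns text language sql as $$
        select case when state is null
                    then '[' || row_to_json(rec)::text
                    else state || ',' || row_to_json(rec)::text
               end;
    $$;

    create or replace function json_agg_finish(state text)
    returns json language sql as $$
        -- coalesce makes an empty input produce '[]'
        select (coalesce(state, '[') || ']')::json;
    $$;

    create aggregate json_agg_sketch(anyelement) (
        sfunc     = json_agg_append,
        stype     = text,
        finalfunc = json_agg_finish
    );

    -- SELECT json_agg_sketch(q) FROM (yourquery) q;

This accumulates a growing text value instead of materializing an array of records first, which is the double-buffering Hannu objects to.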
Re: [HACKERS] JSON for PG 9.2
On 04/16/2012 09:34 AM, Hannu Krosing wrote: based on Abhijit's feeling and some discussion offline, the consensus seems to be to remove query_to_json. The only comment I have here is that query_to_json could have been replaced with json_agg, so that you don't need to do double-buffering for the results of the array(yourquery) call in SELECT array_to_json(array(yourquery)); Or is there some other way to avoid it except to wrap row_to_json() calls in its own aggregate function which adds enclosing brackets and a comma separator (like this: '[' row1 [, rowN] ']')? The way I usually write this is: select array_to_json(array_agg(q)) from (yourquery) q; It's a pity you didn't make this comment back in January when we were talking about this. I think it's too late now in this release cycle to be talking about adding the aggregate function. cheers andrew
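A quick illustration of Andrew's pattern (column names arbitrary; the output shape is what the 9.2 functions produce):

    SELECT array_to_json(array_agg(q))
    FROM (SELECT x AS a, x * 2 AS b FROM generate_series(1,3) x) q;
    -- [{"a":1,"b":2},{"a":2,"b":4},{"a":3,"b":6}]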
Re: [HACKERS] JSON for PG 9.2
On Mon, Apr 16, 2012 at 9:10 AM, Andrew Dunstan and...@dunslane.net wrote: On 04/16/2012 09:34 AM, Hannu Krosing wrote: based on Abhijit's feeling and some discussion offline, the consensus seems to be to remove query_to_json. The only comment I have here is that query_to_json could have been replaced with json_agg, so that you don't need to do double-buffering for the results of the array(yourquery) call in SELECT array_to_json(array(yourquery)); Or is there some other way to avoid it except to wrap row_to_json() calls in its own aggregate function which adds enclosing brackets and a comma separator (like this: '[' row1 [, rowN] ']')? The way I usually write this is: select array_to_json(array_agg(q)) from (yourquery) q; It's a pity you didn't make this comment back in January when we were talking about this. I think it's too late now in this release cycle to be talking about adding the aggregate function. I find array_agg to be pretty consistently slower than array()...although not much, say around 5-10%. I use array_agg only when grouping. try timing

select array_to_json(array_agg(v)) from (select v from generate_series(1,100) v) q;

vs

select array_to_json(array(select v from generate_series(1,100) v));

I agree with Hannu but as things stand if I'm trying to avoid the extra buffer I've found myself doing the final aggregation on the client -- it's easy enough. BTW, I'm using the json stuff heavily and it's just absolutely fantastic. Finally I can write web applications without wondering exactly where it was that computer science went off the rails. I've already demoed a prototype app that integrates pg directly with the many high quality js libraries out there and it makes things very easy and quick by making writing data services trivial. Data pushes are still quite a pain but I figure something can be worked out. merlin
Re: [HACKERS] JSON for PG 9.2
On Mon, 2012-04-16 at 10:10 -0400, Andrew Dunstan wrote: On 04/16/2012 09:34 AM, Hannu Krosing wrote: based on Abhijit's feeling and some discussion offline, the consensus seems to be to remove query_to_json. The only comment I have here is that query_to_json could have been replaced with json_agg, so that you don't need to do double-buffering for the results of the array(yourquery) call in SELECT array_to_json(array(yourquery)); Or is there some other way to avoid it except to wrap row_to_json() calls in its own aggregate function which adds enclosing brackets and a comma separator (like this: '[' row1 [, rowN] ']')? The way I usually write this is: select array_to_json(array_agg(q)) from (yourquery) q; It's a pity you didn't make this comment back in January when we were talking about this. I think it's too late now in this release cycle to be talking about adding the aggregate function. My comment is not meant to propose changing anything in 9.2. I think what we have here is absolutely fantastic :) If doing something in 9.3 then what I would like is some way to express multiple queries. Basically a variant of query_to_json(query text[]) where queries would be evaluated in order and then all the results aggregated into one json object. But aggregation on the client as suggested by Merlin may be a better way to do it for larger result(set)s. Especially as it could enable streaming of the resultsets without having to first buffer everything on the server. If we can add something, then perhaps a deeper pretty_print feature. Samples:

hannu=# \d test
                            Table "public.test"
 Column |            Type             |                     Modifiers
--------+-----------------------------+----------------------------------------------------
 id     | integer                     | not null default nextval('test_id_seq'::regclass)
 data   | text                        |
 tstamp | timestamp without time zone | default now()
Indexes:
    "test_pkey" PRIMARY KEY, btree (id)

hannu=# select array_to_json(array(select test from test),true);
-[ RECORD 1 ]-+
array_to_json | [{"id":1,"data":"0.262814193032682","tstamp":"2012-04-05 13:21:03.235204"},
              |  {"id":2,"data":"0.157406373415142","tstamp":"2012-04-05 13:21:05.2033"}]

This is OK

hannu=# \d test2
                            Table "public.test2"
 Column |            Type             |                     Modifiers
--------+-----------------------------+----------------------------------------------------
 id     | integer                     | not null default nextval('test2_id_seq'::regclass)
 data2  | test                        |
 tstamp | timestamp without time zone | default now()
Indexes:
    "test2_pkey" PRIMARY KEY, btree (id)

hannu=# select array_to_json(array(select test2 from test2),true);
-[ RECORD 1 ]-+---
array_to_json | [{"id":1,"data2":{"id":1,"data":"0.262814193032682","tstamp":"2012-04-05 13:21:03.235204"},"tstamp":"2012-04-05 13:25:03.644497"},
              |  {"id":2,"data2":{"id":2,"data":"0.157406373415142","tstamp":"2012-04-05 13:21:05.2033"},"tstamp":"2012-04-05 13:25:03.644497"}]

This is kind of OK

hannu=# \d test3
                            Table "public.test3"
 Column |            Type             |                     Modifiers
--------+-----------------------------+----------------------------------------------------
 id     | integer                     | not null default nextval('test3_id_seq'::regclass)
 data3  | test2[]                     |
 tstamp | timestamp without time zone | default now()
Indexes:
    "test3_pkey" PRIMARY KEY, btree (id)

hannu=# select array_to_json(array(select test3 from test3),true);
-[ RECORD 1 ]-+---
array_to_json | [{"id":1,"data3":[{"id":1,"data2":{"id":1,"data":"0.262814193032682","tstamp":"2012-04-05 13:21:03.235204"},"tstamp":"2012-04-05 13:25:03.644497"},{"id":2,"data2":{"id":2,"data":"0.157406373415142","tstamp":"2012-04-05 13:21:05.2033"},"tstamp":"2012-04-05 13:25:03.644497"}],"tstamp":"2012-04-16 14:40:15.795947"}]

but this would be nicer if printed like python's pprint:

>>> pprint.pprint(d)
[{'data3': [{'data2': {'data': '0.262814193032682',
                       'id': 1,
                       'tstamp': '2012-04-05 13:21:03.235204'},
             'id': 1,
             'tstamp': '2012-04-05 13:25:03.644497'},
            {'data2': {'data': '0.157406373415142',
                       'id': 2,
                       'tstamp': '2012-04-05 13:21:05.2033'},
             'id': 2,
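A rough sketch of the query_to_json(query text[]) variant Hannu floats, as a plpgsql function (hypothetical name and object keys; each query text must return a single column, e.g. a whole-row reference, and the usual caveats about interpolated SQL apply):

    create function queries_to_json(queries text[]) returns text
    language plpgsql as $$
    declare
        q     text;
        part  text;
        parts text[] := '{}';
        i     int := 0;
    begin
        -- evaluate the queries in order, collecting each result set
        -- as one field of a single JSON object
        foreach q in array queries loop
            i := i + 1;
            execute 'select array_to_json(array(' || q || '))::text' into part;
            parts := parts || ('"q' || i || '":' || part);
        end loop;
        return '{' || array_to_string(parts, ',') || '}';
    end;
    $$;

    -- SELECT queries_to_json(ARRAY['select test from test',
    --                              'select test2 from test2']);
    -- {"q1":[...],"q2":[...]}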
Re: [HACKERS] JSON for PG 9.2
On Mon, Apr 16, 2012 at 11:19 AM, Hannu Krosing ha...@2ndquadrant.com wrote: If doing something in 9.3 then what I would like is some way to express multiple queries. Basically a variant of query_to_json(query text[]) where queries would be evaluated in order and then all the results aggregated into one json object. I personally don't like variants of to_json that push the query in as text. They defeat parameterization and have other issues. Another point for client side processing is the new row level processing in libpq, so I'd argue that if the result is big enough to warrant worrying about buffering (and it'd have to be a mighty big json doc), the best bet is to extract it as rows. I'm playing around with node.js for the json serving and the sending code looks like this:

var first = true;
query.on('row', function(row) {
    if (first) {
        first = false;
        response.write('[');
    } else {
        response.write(',');
    }
    response.write(row.jsondata);
});
query.on('end', function() {
    response.write(']');
    response.end();
});

-- not too bad

merlin
Re: [HACKERS] JSON for PG 9.2
At 2012-02-01 11:28:50 -0500, robertmh...@gmail.com wrote: It's also pretty clear that JSON string -> PG text data type is going to admit of a number of error conditions (transcoding errors and perhaps invalid surrogate pairs) so throwing one more on the pile doesn't cost much. Hi Robert. I'm sorry for being slow, but I don't understand what you're proposing to do here (if anything). Could I ask you to explain, please? Are you talking about allowing the six literal bytes \u0000 to be present in the JSON? If so, I agree, there seems to be no reason to disallow it. Are you also saying we should allow any \uXXXX sequence, without checking for errors (e.g. invalid surrogate pairs or parts thereof)? And what transcoding errors are you referring to? -- ams
Re: [HACKERS] JSON for PG 9.2
On Thu, Feb 2, 2012 at 4:54 AM, Abhijit Menon-Sen a...@toroid.org wrote: At 2012-02-01 11:28:50 -0500, robertmh...@gmail.com wrote: It's also pretty clear that JSON string -> PG text data type is going to admit of a number of error conditions (transcoding errors and perhaps invalid surrogate pairs) so throwing one more on the pile doesn't cost much. I'm sorry for being slow, but I don't understand what you're proposing to do here (if anything). Could I ask you to explain, please? Are you talking about allowing the six literal bytes \u0000 to be present in the JSON? If so, I agree, there seems to be no reason to disallow it. Are you also saying we should allow any \uXXXX sequence, without checking for errors (e.g. invalid surrogate pairs or parts thereof)? And what transcoding errors are you referring to? Consider the following JSON object: "abc" This is a JSON string. Someone is eventually going to propose a function with a name like json_to_string() which, when given this JSON object, returns a three-character string with the PostgreSQL text type. That's useful and I support it. But now suppose we pass this JSON object to that same function: "a\u0062c" The user will quite rightly expect that since this JSON object represents the same value as the first JSON object, they're going to get the same answer back from json_to_string(), namely abc. So far, so good. But now suppose we pass this JSON object to that same function: "a\u0000c" This is going to have to be an error condition, because PostgreSQL does not allow values of type text to contain embedded NUL characters. Now consider this: "a\uABCDc" Suppose that \uABCD represents a character that exists in Unicode, but the server-encoding is SHIFT-JIS or EUC-JP or some other system which has no equivalent for the character represented by \uABCD. Again, when json_to_string() is applied to a value of this type, it must fail. In other words, we're knowingly allowing JSON strings to contain characters which might not be representable as PostgreSQL strings, because JSON allows any Unicode character, and the server encoding might not be Unicode, and the server doesn't allow embedded NULs in any encoding. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
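Robert's four cases, tabulated against the hypothetical json_to_string() (no such function exists at this point; the results shown are the ones the argument above implies):

    -- JSON value      json_to_string() result
    -- '"abc"'         'abc'
    -- '"a\u0062c"'    'abc'    (same JSON value, different spelling)
    -- '"a\u0000c"'    ERROR    (text cannot contain an embedded NUL)
    -- '"a\uABCDc"'    ERROR    when the server encoding has no
    --                          equivalent for U+ABCD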
Re: [HACKERS] JSON for PG 9.2
At 2012-02-02 08:54:32 -0500, robertmh...@gmail.com wrote: Someone is eventually going to propose a function with a name like json_to_string() which, when given this JSON object, returns a three-character string with the PostgreSQL text type. Ah, that's the bit I was missing. I thought you were talking about an immediate error condition. That's useful and I support it. Agreed. Also, now I understand that you are saying that json_to_string() (json_string_to_text?) would fail if the result couldn't be represented as a text in the current encoding, and that's sensible as well. I had misunderstood "is going to admit of a number of error…" in your mail. As for surrogate pairs, just to be clear, what I was proposing earlier in the thread was to change json.c:json_lex_string() to detect errors (e.g. only one half of a surrogate pair, which is the commonest error I've encountered in the wild) and reject such strings. Thanks for the explanation. -- ams
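The cases Abhijit wants json_lex_string() to catch, as psql probes (the first line is the RFC's G clef example; whether the last two actually raise errors depends on the validation ultimately committed):

    SELECT '"\uD834\uDD1E"'::json;  -- well-formed surrogate pair (U+1D11E): accept
    SELECT '"\uD834"'::json;        -- lone high surrogate: should be rejected
    SELECT '"\uDD1E"'::json;        -- low surrogate with no preceding high: should be rejected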
Re: [HACKERS] JSON for PG 9.2
On Tue, Jan 31, 2012 at 11:46 PM, Andrew Dunstan and...@dunslane.net wrote: The array(select...) locution turns out to have less flexibility than the array_agg(record-ref) locution. Less flexible maybe, but it can be cleaner for exactly the type of queries that will tend to come up in exactly the type of functionality people are looking for with JSON output. libpqtypes does exactly the same stuff but for C clients -- so I've done tons of this kind of programming and am maybe a bit ahead of the curve here. Note: while the following contrived example may seem a bit complex it has a certain elegance and shows how the postgres type system can whip out document style 'nosql' objects to clients who can handle them. Perhaps there is more simplification through syntax possible, but as it stands things are pretty functional. The equivalent production through array_agg I find to be pretty awful looking although it can produce a better plan since it doesn't force everything through flattened subqueries:

create table foo (
    foo_id serial primary key,
    a int
);
create table bar (
    bar_id serial primary key,
    foo_id int references foo,
    b int
);
create table baz (
    baz_id serial primary key,
    bar_id int references bar,
    c int
);

create type baz_t as (c int);
create type bar_t as (bazs baz_t[], b int);
create type foo_t as (bars bar_t[], a int);

INSERT INTO foo(a) VALUES (1);
INSERT INTO bar(foo_id, b) VALUES (currval('foo_foo_id_seq'), 100);
INSERT INTO baz(bar_id, c) VALUES (currval('bar_bar_id_seq'), 1000);
INSERT INTO baz(bar_id, c) VALUES (currval('bar_bar_id_seq'), 2000);
INSERT INTO bar(foo_id, b) VALUES (currval('foo_foo_id_seq'), 200);
INSERT INTO baz(bar_id, c) VALUES (currval('bar_bar_id_seq'), 3000);
INSERT INTO baz(bar_id, c) VALUES (currval('bar_bar_id_seq'), 4000);
INSERT INTO foo(a) VALUES (2);
INSERT INTO bar(foo_id, b) VALUES (currval('foo_foo_id_seq'), 300);
INSERT INTO baz(bar_id, c) VALUES (currval('bar_bar_id_seq'), 5000);
INSERT INTO baz(bar_id, c) VALUES (currval('bar_bar_id_seq'), 6000);
INSERT INTO bar(foo_id, b) VALUES (currval('foo_foo_id_seq'), 400);
INSERT INTO baz(bar_id, c) VALUES (currval('bar_bar_id_seq'), 7000);
INSERT INTO baz(bar_id, c) VALUES (currval('bar_bar_id_seq'), 8000);

-- nosql!
select array(
    select row(
        array(
            select row(
                array(
                    select row(c)::baz_t
                    from baz
                    where baz.bar_id = bar.bar_id
                )::baz_t[],
                b
            )::bar_t
            from bar
            where bar.foo_id = foo.foo_id
        )::bar_t[],
        a
    )::foo_t
    from foo
)::foo_t[];

foo_t
---
{(\{\\(\\{(1000),(2000)}\\,100)\\,\\(\\{(3000),(4000)}\\,200)\\}\,1),(\{\\(\\{(5000),(6000)}\\,300)\\,\\(\\{(7000),(8000)}\\,400)\\}\,2)}

as you can see, the postgres default escaping format sucks for sending nested data -- throw even one quote or backslash in there and your data can explode in size 10+ times -- this is why we insisted on binary. json, of course, is much better suited for this type of communication. despite the complicated look of the above, this type of code is in fact very easy to write once you get the knack. This type of coding also leads to much simpler coding on the client since relationships are directly built into the structure and don't have to be inferred or duplicated. merlin
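For comparison, one possible rendering of the array_agg() formulation Merlin calls awful (a sketch; note the inner joins drop parents with no children, so left joins would be needed for full fidelity, which makes it uglier still):

    select array_agg(row(bars, a)::foo_t)
    from (
        select foo.a,
               array_agg(row(bz.bazs, bz.b)::bar_t) as bars
        from foo
        join (
            select bar.foo_id, bar.b,
                   array_agg(row(baz.c)::baz_t) as bazs
            from bar
            join baz on baz.bar_id = bar.bar_id
            group by bar.bar_id, bar.foo_id, bar.b
        ) bz on bz.foo_id = foo.foo_id
        group by foo.foo_id, foo.a
    ) f;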
Re: [HACKERS] JSON for PG 9.2
On Tue, Jan 31, 2012 at 3:47 PM, Joey Adams joeyadams3.14...@gmail.com wrote: I'm mostly in favor of allowing \u0000. Banning \u0000 means users can't use JSON strings to marshal binary blobs, e.g. by escaping non-printable characters and only using U+0000..U+00FF. Instead, they have to use base64 or similar. I agree. I mean, representing data using six bytes per source byte is a bit unattractive from an efficiency point of view, but I'm sure someone is going to want to do it. It's also pretty clear that JSON string -> PG text data type is going to admit of a number of error conditions (transcoding errors and perhaps invalid surrogate pairs) so throwing one more on the pile doesn't cost much. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] JSON for PG 9.2
On Mon, Jan 30, 2012 at 9:37 AM, Andrew Dunstan and...@dunslane.net wrote: On 01/30/2012 09:54 AM, Abhijit Menon-Sen wrote: At 2012-01-27 09:47:05 +0530, a...@toroid.org wrote: I've started reviewing this patch, but it'll take me a bit longer to go through json.c properly. OK, I finished reading json.c. I don't have an answer to the detoasting question in the XXX comment, but the code looks fine. Looking at somewhat analogous code in xml.c, it doesn't seem to be done there. So maybe we don't need to worry about it. Aside: is query_to_json really necessary? It seems rather ugly and easily avoidable using row_to_json. I started with this, again by analogy with query_to_xml(). But I agree it's a bit ugly. If we're not going to do it, then we definitely need to look at caching the output funcs in the function info. A closer approximation is actually: SELECT array_to_json(array_agg(q)) FROM ( your query here ) q; yup -- although I'd probably write it like this most of the time: select array_to_json(array( query )); if we did have a 'query_to_json', the array() constructor would be a lot more pleasant to deal with than a textual query for obvious reasons even though it's highly irregular syntax. however, since arrays can already handle it, I wouldn't miss it at all. merlin
Re: [HACKERS] JSON for PG 9.2
On Mon, Jan 23, 2012 at 3:20 PM, Peter Eisentraut pete...@gmx.net wrote: On sön, 2012-01-22 at 11:43 -0500, Andrew Dunstan wrote: Actually, given recent discussion I think that test should just be removed from json.c. We don't actually have any test that the code point is valid (e.g. that it doesn't refer to an unallocated code point). We don't do that elsewhere either - the unicode_to_utf8() function the scanner uses to turn \u escapes into utf8 doesn't look for unallocated code points. I'm not sure how much other validation we should do - for example on correct use of surrogate pairs. We do check the correctness of surrogate pairs elsewhere. Search for surrogate in scan.l; should be easy to copy. I've committed a version of this that does NOT do surrogate pair validation. Per discussion elsewhere, I also removed the check for \uXXXX with XXXX > 007F and database encoding != UTF8. This will complicate things somewhat when we get around to doing canonicalization and comparison, but Tom seems confident that those issues are manageable. I did not commit Andrew's further changes, either; I'm assuming he'll do that himself. With respect to the issue of whether we ought to check surrogate pairs, the JSON spec is not a whole lot of help. RFC 4627 says: "To escape an extended character that is not in the Basic Multilingual Plane, the character is represented as a twelve-character sequence, encoding the UTF-16 surrogate pair. So, for example, a string containing only the G clef character (U+1D11E) may be represented as \uD834\uDD1E." That fails to answer the question of what we ought to do if we get an invalid sequence there. You could make an argument that we ought to just allow it; it doesn't particularly hinder our ability to canonicalize or compare strings, because our notion of sort-ordering for characters that may span multiple encodings is going to be pretty funky anyway. We can just leave those bits as \uXXXX sequences and call it good. However, it would hinder our ability to convert a JSON string to a string in the database encoding: we could find an invalid surrogate pair that was allowable as JSON but unrepresentable in the database encoding. On the flip side, given our decision to allow all \uXXXX sequences even when not using UTF-8, we could also run across a perfectly valid UTF-8 sequence that's not representable as a character in the server encoding, so it seems we have that problem anyway, so maybe it's not much worse to have two reasons why it can happen rather than one. On the third hand, most people are probably using UTF-8, and those people aren't going to have any transcoding issues, so the invalid surrogate pair case may be the only one they can hit (unless invalid code points are also an issue?), so maybe it's worth avoiding on that basis. Anyway, I defer to the wisdom of the collective on this one: how should we handle this? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] JSON for PG 9.2
On 01/30/2012 10:37 AM, Andrew Dunstan wrote: Aside: is query_to_json really necessary? It seems rather ugly and easily avoidable using row_to_json. I started with this, again by analogy with query_to_xml(). But I agree it's a bit ugly. If we're not going to do it, then we definitely need to look at caching the output funcs in the function info. A closer approximation is actually: SELECT array_to_json(array_agg(q)) FROM ( your query here ) q; But then I'd want the ability to break that up a bit with line feeds, so we'd need to adjust the interface slightly. (Hint: don't try the above with select * from pg_class.) I'll wait on further comments, but I can probably turn these changes around very quickly once we're agreed. based on Abhijit's feeling and some discussion offline, the consensus seems to be to remove query_to_json. cheers andrew
Re: [HACKERS] JSON for PG 9.2
Andrew, based on Abhijit's feeling and some discussion offline, the consensus seems to be to remove query_to_json. If we do that, what would getting complete query results back from a query look like? It's important to make this as simple for developers as possible. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] JSON for PG 9.2
At 2012-01-31 12:04:31 -0500, robertmh...@gmail.com wrote: That fails to answer the question of what we ought to do if we get an invalid sequence there. I think it's best to categorically reject invalid surrogates as early as possible, considering the number of bugs that are related to them (not in Postgres, just in general). I can't see anything good coming from letting them in and leaving them to surprise someone in future. -- ams
Re: [HACKERS] JSON for PG 9.2
On Tue, Jan 31, 2012 at 12:15 PM, Josh Berkus j...@agliodbs.com wrote: Andrew, based on Abhijit's feeling and some discussion offline, the consensus seems to be to remove query_to_json. If we do that, what would getting complete query results back from a query look like? It's important to make this as simple for developers as possible. two options:

1. row_to_json(rowvar)

SELECT row_to_json(foo) from foo;
SELECT row_to_json(row(a,b,c)) from foo;

2. array_to_json(array_agg()/array())

SELECT array_to_json(array(select foo from foo));
SELECT array_to_json(array[1,2,3]);

#1 I expect will be the more used version -- most json handling client side api (for example node.js drivers) are optimized for row by row processing, but via #2 you can stuff a whole query into a single json object if you're so inclined. merlin
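The shapes these produce, for reference (the row_to_json output matches the patch's regression tests):

    SELECT row_to_json(row(1,'foo'));    -- {"f1":1,"f2":"foo"}
    SELECT array_to_json(array[1,2,3]);  -- [1,2,3]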
Re: [HACKERS] JSON for PG 9.2
On 01/31/2012 01:32 PM, Merlin Moncure wrote: On Tue, Jan 31, 2012 at 12:15 PM, Josh Berkus j...@agliodbs.com wrote: Andrew, based on Abhijit's feeling and some discussion offline, the consensus seems to be to remove query_to_json. If we do that, what would getting complete query results back from a query look like? It's important to make this as simple for developers as possible. two options:

1. row_to_json(rowvar)

SELECT row_to_json(foo) from foo;
SELECT row_to_json(row(a,b,c)) from foo;

2. array_to_json(array_agg()/array())

SELECT array_to_json(array(select foo from foo));
SELECT array_to_json(array[1,2,3]);

#1 I expect will be the more used version -- most json handling client side api (for example node.js drivers) are optimized for row by row processing, but via #2 you can stuff a whole query into a single json object if you're so inclined. You could also write a wrapper something like this:

create function query_to_json(qtext text) returns json language plpgsql as $$
declare
    result json;
begin
    execute 'select array_to_json(array(' || qtext || '))' into result;
    return result;
end;
$$;

cheers andrew
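Usage of the wrapper (note that qtext is interpolated directly into the executed SQL, so it is only safe for trusted query strings):

    SELECT query_to_json('select x AS a, x * 2 AS b from generate_series(1,3) x');
    -- [{"a":1,"b":2},{"a":2,"b":4},{"a":3,"b":6}]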
Re: [HACKERS] JSON for PG 9.2
On Tue, Jan 31, 2012 at 12:48 PM, Andrew Dunstan and...@dunslane.net wrote: On 01/31/2012 01:32 PM, Merlin Moncure wrote: On Tue, Jan 31, 2012 at 12:15 PM, Josh Berkus j...@agliodbs.com wrote: Andrew, based on Abhijit's feeling and some discussion offline, the consensus seems to be to remove query_to_json. If we do that, what would getting complete query results back from a query look like? It's important to make this as simple for developers as possible. two options:

1. row_to_json(rowvar)

SELECT row_to_json(foo) from foo;
SELECT row_to_json(row(a,b,c)) from foo;

2. array_to_json(array_agg()/array())

SELECT array_to_json(array(select foo from foo));
SELECT array_to_json(array[1,2,3]);

#1 I expect will be the more used version -- most json handling client side api (for example node.js drivers) are optimized for row by row processing, but via #2 you can stuff a whole query into a single json object if you're so inclined. You could also write a wrapper something like this:

create function query_to_json(qtext text) returns json language plpgsql as $$
declare
    result json;
begin
    execute 'select array_to_json(array(' || qtext || '))' into result;
    return result;
end;
$$;

right -- then you can leverage execute/using parameterization etc. and/or rig a variadic version. The major hole in functionality I see for heavy json users is the reverse; how do you get json back into the database? With xml, at least you could (ab)use xpath for that...with json you have to rely on add-on support and/or ad hoc string parsing (that is, unless I'm missing something -- I just noted Robert's commit of the JSON type). since we can do:

select array_to_json(array(select foo from foo));

it seems natural to want to do something like:

WITH foos AS (SELECT a_json_var::foo[] AS f)
INSERT INTO foo SELECT (f).* FROM foos;

Of course, you'd have to have non-anonymous (that is, defined with CREATE TYPE AS) types defined to receive all the data, but that's not so bad. Also, could xxx_to_json be hypothetically executed via casts? e.g.

select array(select foo from foo)::json;

merlin
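One way to rig the parameterized variant Merlin mentions (a sketch; the caller's query text refers to the variadic arguments as $1[1], $1[2], and so on, since EXECUTE ... USING binds the whole array as $1):

    create function query_to_json(qtext text, variadic args text[])
    returns json language plpgsql as $$
    declare
        result json;
    begin
        execute 'select array_to_json(array(' || qtext || '))'
            into result using args;
        return result;
    end;
    $$;

    -- SELECT query_to_json('select x from generate_series(1, ($1[1])::int) x', '3');
    -- [1,2,3]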
Re: [HACKERS] JSON for PG 9.2
On 01/31/2012 02:49 PM, Merlin Moncure wrote: The major hole in functionality I see for heavy json users is the reverse; how do you get json back into the database? With xml, at least you could (ab)use xpath for that...with json you have to rely on add-on support and/or ad hoc string parsing (that is, unless I'm missing something -- I just noted Robert's commit of the JSON type). since we can do: select array_to_json(array(select foo from foo)); it seems natural to want to do something like: WITH foos AS (SELECT a_json_var::foo[] AS f) INSERT INTO foo SELECT (f).* FROM foos; Of course, you'd have to have non-anonymous (that is, defined with CREATE TYPE AS) types defined to receive all the data, but that's not so bad. Also, could xxx_to_json be hypothetically executed via casts? e.g. select array(select foo from foo)::json; At some stage there will possibly be some json-processing (as opposed to json-producing) functions, but not in 9.2 - it's too late for that. Until then there is PLV8: see http://people.planetpostgresql.org/andrew/index.php?/archives/249-Using-PLV8-to-index-JSON.html for example. Or someone could write an extension. cheers andrew
Re: [HACKERS] JSON for PG 9.2
On Tue, Jan 31, 2012 at 1:29 PM, Abhijit Menon-Sen a...@toroid.org wrote: At 2012-01-31 12:04:31 -0500, robertmh...@gmail.com wrote: That fails to answer the question of what we ought to do if we get an invalid sequence there. I think it's best to categorically reject invalid surrogates as early as possible, considering the number of bugs that are related to them (not in Postgres, just in general). I can't see anything good coming from letting them in and leaving them to surprise someone in future. -- ams +1 Another sequence to beware of is \u0000. While escaped NUL characters are perfectly valid in JSON, NUL characters aren't allowed in TEXT values. This means not all JSON strings can be converted to TEXT, even in UTF-8. This may also complicate collation, if comparison functions demand null-terminated strings. I'm mostly in favor of allowing \u0000. Banning \u0000 means users can't use JSON strings to marshal binary blobs, e.g. by escaping non-printable characters and only using U+0000..U+00FF. Instead, they have to use base64 or similar. Banning \u0000 doesn't quite violate the RFC: "An implementation may set limits on the length and character contents of strings." -Joey
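The marshalling trick Joey describes, sketched as a helper (hypothetical function; every byte becomes a six-character \u00XX escape, which is the six-bytes-per-source-byte cost Robert concedes above):

    create function bytea_to_json_string(b bytea) returns text
    language sql immutable as $$
        -- returns NULL for empty input; wrap in coalesce if that matters
        select '"' || string_agg('\u00' || lpad(to_hex(get_byte(b, i)), 2, '0'),
                                 '' order by i)
                   || '"'
        from generate_series(0, octet_length(b) - 1) as i;
    $$;

    -- SELECT bytea_to_json_string('\x00ff10');  -- "\u0000\u00ff\u0010"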
Re: [HACKERS] JSON for PG 9.2
On 01/31/2012 01:48 PM, Andrew Dunstan wrote: On 01/31/2012 01:32 PM, Merlin Moncure wrote: On Tue, Jan 31, 2012 at 12:15 PM, Josh Berkus j...@agliodbs.com wrote: Andrew, based on Abhijit's feeling and some discussion offline, the consensus seems to be to remove query_to_json. If we do that, what would getting complete query results back from a query look like? It's important to make this as simple for developers as possible. two options:

1. row_to_json(rowvar)

SELECT row_to_json(foo) from foo;
SELECT row_to_json(row(a,b,c)) from foo;

2. array_to_json(array_agg()/array())

SELECT array_to_json(array(select foo from foo));
SELECT array_to_json(array[1,2,3]);

#1 I expect will be the more used version -- most json handling client side api (for example node.js drivers) are optimized for row by row processing, but via #2 you can stuff a whole query into a single json object if you're so inclined. You could also write a wrapper something like this:

create function query_to_json(qtext text) returns json language plpgsql as $$
declare
    result json;
begin
    execute 'select array_to_json(array(' || qtext || '))' into result;
    return result;
end;
$$;

The array(select...) locution turns out to have less flexibility than the array_agg(record-ref) locution. Anyway, for those playing along, I have removed query_to_json, and added optional pretty printing to array_to_json and row_to_json. WIP can be seen at https://bitbucket.org/adunstan/pgdevel. I still have docs and output function caching to do, and should post a revised patch within the next 48 hours. cheers andrew
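The optional pretty printing Andrew mentions is just a second boolean argument on the two remaining functions, e.g.:

    SELECT array_to_json(array(SELECT x FROM generate_series(1,3) x), true);
    -- emits a line feed between array elements instead of packing them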
Re: [HACKERS] JSON for PG 9.2
At 2012-01-27 09:47:05 +0530, a...@toroid.org wrote: I've started reviewing this patch, but it'll take me a bit longer to go through json.c properly. OK, I finished reading json.c. I don't have an answer to the detoasting question in the XXX comment, but the code looks fine. Aside: is query_to_json really necessary? It seems rather ugly and easily avoidable using row_to_json. -- ams
Re: [HACKERS] JSON for PG 9.2
On 01/30/2012 09:54 AM, Abhijit Menon-Sen wrote: At 2012-01-27 09:47:05 +0530, a...@toroid.org wrote: I've started reviewing this patch, but it'll take me a bit longer to go through json.c properly. OK, I finished reading json.c. I don't have an answer to the detoasting question in the XXX comment, but the code looks fine. Looking at somewhat analogous code in xml.c, it doesn't seem to be done there. So maybe we don't need to worry about it. Aside: is query_to_json really necessary? It seems rather ugly and easily avoidable using row_to_json. I started with this, again by analogy with query_to_xml(). But I agree it's a bit ugly. If we're not going to do it, then we definitely need to look at caching the output funcs in the function info. A closer approximation is actually: SELECT array_to_json(array_agg(q)) FROM ( your query here ) q; But then I'd want the ability to break that up a bit with line feeds, so we'd need to adjust the interface slightly. (Hint: don't try the above with select * from pg_class.) I'll wait on further comments, but I can probably turn these changes around very quickly once we're agreed. Cheers andrew
Re: [HACKERS] JSON for PG 9.2
At 2012-01-15 11:08:05 -0500, and...@dunslane.net wrote: Here's an update that adds row_to_json, plus a bit more cleanup. I've started reviewing this patch, but it'll take me a bit longer to go through json.c properly. Here are a few preliminary notes: 1. The patch has a lot of whitespace errors (primarily lines ending in whitespace), but applying with git apply --whitespace=fix isn't safe, because the test results need some (but not all) of those spaces. I applied the patch, backed out the changes to expected/json.out, and created the file from the patch, then removed the superfluous whitespace. 2. I bumped some function OIDs to avoid conflicts. 3. One documentation typo. Everything other than json.c (which I haven't read yet) looks fine (builds, passes tests). I've attached a patch covering the changes I made. More later. -- ams

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4f8b35e..ce4c4f6 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -9626,7 +9626,7 @@ table2-mapping
 /indexterm
 para
-This section descripbes the functions that are available for creating
+This section describes the functions that are available for creating
 JSON (see xref linkend=datatype-json) data.
 /para

diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index cdfa4cc..02c8679 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -4027,11 +4027,11 @@
 DATA(insert OID = 323 ( json_recv PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 11
 DESCR(I/O);
 DATA(insert OID = 324 ( json_send PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 17 114 _null_ _null_ _null_ _null_ json_send _null_ _null_ _null_ ));
 DESCR(I/O);
-DATA(insert OID = 3144 ( query_to_json PGNSP PGUID 12 1 0 0 0 f f f t f s 2 0 114 25 16 _null_ _null_ _null_ _null_ query_to_json _null_ _null_ _null_ ));
+DATA(insert OID = 3153 ( query_to_json PGNSP PGUID 12 1 0 0 0 f f f t f s 2 0 114 25 16 _null_ _null_ _null_ _null_ query_to_json _null_ _null_ _null_ ));
 DESCR(I/O);
-DATA(insert OID = 3145 ( array_to_json PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 114 2277 _null_ _null_ _null_ _null_ array_to_json _null_ _null_ _null_ ));
+DATA(insert OID = 3154 ( array_to_json PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 114 2277 _null_ _null_ _null_ _null_ array_to_json _null_ _null_ _null_ ));
 DESCR(I/O);
-DATA(insert OID = 3146 ( row_to_json PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 114 2249 _null_ _null_ _null_ _null_ row_to_json _null_ _null_ _null_ ));
+DATA(insert OID = 3155 ( row_to_json PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 114 2249 _null_ _null_ _null_ _null_ row_to_json _null_ _null_ _null_ ));
 DESCR(I/O);
 /* uuid */

diff --git a/src/test/regress/expected/json.out b/src/test/regress/expected/json.out
index b975d72..1984106 100644
--- a/src/test/regress/expected/json.out
+++ b/src/test/regress/expected/json.out
@@ -278,11 +278,11 @@ SELECT query_to_json('select x as b, x * 2 as c from generate_series(1,3) x',tru (1 row) SELECT query_to_json(' - SELECT $$a$$ || x AS b, - y AS c, + SELECT $$a$$ || x AS b, + y AS c, ARRAY[ROW(x.*,ARRAY[1,2,3]), - ROW(y.*,ARRAY[4,5,6])] AS z - FROM generate_series(1,2) x, + ROW(y.*,ARRAY[4,5,6])] AS z + FROM generate_series(1,2) x, generate_series(4,5) y',true); query_to_json --
@@ -299,19 +299,19 @@ SELECT query_to_json('select array_agg(x) as d from generate_series(5,10) x',fal (1 row) -- array_to_json -SELECT array_to_json(array_agg(x)) FROM generate_series(1,10) x; array_to_json [1,2,3,4,5,6,7,8,9,10] (1 row) -SELECT array_to_json(array_agg(q)) -FROM (SELECT $$a$$ || x AS b, - y AS c, + SELECT array_to_json(array_agg(q)) +FROM (SELECT $$a$$ || x AS b, + y AS c, ARRAY[ROW(x.*,ARRAY[1,2,3]), - ROW(y.*,ARRAY[4,5,6])] AS z - FROM generate_series(1,2) x, + ROW(y.*,ARRAY[4,5,6])] AS z + FROM generate_series(1,2) x, generate_series(4,5) y) q; array_to_json ---
@@ -331,12 +331,12 @@ SELECT row_to_json(row(1,'foo')); {f1:1,f2:foo} (1 row) -SELECT row_to_json(q) -FROM (SELECT $$a$$ || x AS b, - y AS c, +SELECT
Re: [HACKERS] JSON for PG 9.2
On sön, 2012-01-22 at 11:43 -0500, Andrew Dunstan wrote: Actually, given recent discussion I think that test should just be removed from json.c. We don't actually have any test that the code point is valid (e.g. that it doesn't refer to an unallocated code point). We don't do that elsewhere either - the unicode_to_utf8() function the scanner uses to turn \u escapes into utf8 doesn't look for unallocated code points. I'm not sure how much other validation we should do - for example on correct use of surrogate pairs. We do check the correctness of surrogate pairs elsewhere. Search for surrogate in scan.l; should be easy to copy.
Re: [HACKERS] JSON for PG 9.2
On Sun, Jan 15, 2012 at 10:08 AM, Andrew Dunstan and...@dunslane.net wrote: Here's an update that adds row_to_json, plus a bit more cleanup. why not call all these functions 'to_json' and overload them? merlin
Re: [HACKERS] JSON for PG 9.2
On 01/23/2012 05:21 PM, Merlin Moncure wrote: On Sun, Jan 15, 2012 at 10:08 AM, Andrew Dunstan and...@dunslane.net wrote: Here's an update that adds row_to_json, plus a bit more cleanup. why not call all these functions 'to_json' and overload them? I don't honestly feel that advances clarity much. And we might want to overload each at some stage with options that are specific to the datum type. We have various foo_to_xml() functions now. cheers andrew
Re: [HACKERS] JSON for PG 9.2
2012/1/23 Merlin Moncure mmonc...@gmail.com: On Sun, Jan 15, 2012 at 10:08 AM, Andrew Dunstan and...@dunslane.net wrote: Here's an update that adds row_to_json, plus a bit more cleanup. why not call all these functions 'to_json' and overload them? -1 older proposal is more consistent with xml functions Pavel merlin
Re: [HACKERS] JSON for PG 9.2
On 01/21/2012 11:40 PM, Jeff Janes wrote: On Sun, Jan 15, 2012 at 8:08 AM, Andrew Dunstan and...@dunslane.net wrote: On 01/14/2012 03:06 PM, Andrew Dunstan wrote: OK, here's a patch that does both query_to_json and array_to_json, along with docs and regression tests. It includes Robert's original patch, although I can produce a differential patch if required. It can also be pulled from https://bitbucket.org/adunstan/pgdevel Here's an update that adds row_to_json, plus a bit more cleanup. This is bit-rotted such that initdb fails creating template1 database in /tmp/bar/src/test/regress/./tmp_check/data/base/1 ... FATAL: could not create unique index pg_proc_oid_index DETAIL: Key (oid)=(3145) is duplicated. I bumped up those oids in the patch, and it passes make check once I figure out how to get the test run under UTF-8. Is it supposed to pass under other encodings? I can't tell from the rest of the thread whether it is supposed to pass in other encodings or not. Yeah, regression tests generally are supposed to run in all encodings. Either we could knock out the offending test, or we could supply an alternative result file. If we do the latter, maybe we should modify the query slightly, so it reads: SELECT getdatabaseencoding() = 'UTF8' AS is_utf8, '"\uaBcD"'::json; cheers andrew
Re: [HACKERS] JSON for PG 9.2
On 01/22/2012 04:28 AM, Andrew Dunstan wrote: On 01/21/2012 11:40 PM, Jeff Janes wrote: On Sun, Jan 15, 2012 at 8:08 AM, Andrew Dunstan and...@dunslane.net wrote: On 01/14/2012 03:06 PM, Andrew Dunstan wrote: OK, here's a patch that does both query_to_json and array_to_json, along with docs and regression tests. It includes Robert's original patch, although I can produce a differential patch if required. It can also be pulled from https://bitbucket.org/adunstan/pgdevel Here's an update that adds row_to_json, plus a bit more cleanup. This is bit-rotted such that initdb fails creating template1 database in /tmp/bar/src/test/regress/./tmp_check/data/base/1 ... FATAL: could not create unique index pg_proc_oid_index DETAIL: Key (oid)=(3145) is duplicated. I bumped up those oids in the patch, and it passes make check once I figure out how to get the test run under UTF-8. Is it supposed to pass under other encodings? I can't tell from the rest of the thread whether it is supposed to pass in other encodings or not. Yeah, regression tests generally are supposed to run in all encodings. Either we could knock out the offending test, or we could supply an alternative result file. If we do the latter, maybe we should modify the query slightly, so it reads: SELECT getdatabaseencoding() = 'UTF8' AS is_utf8, '"\uaBcD"'::json; Actually, given recent discussion I think that test should just be removed from json.c. We don't actually have any test that the code point is valid (e.g. that it doesn't refer to an unallocated code point). We don't do that elsewhere either - the unicode_to_utf8() function the scanner uses to turn \u escapes into utf8 doesn't look for unallocated code points. I'm not sure how much other validation we should do - for example on correct use of surrogate pairs. I'd rather get this as right as possible now - every time we tighten encoding rules to make sure incorrectly encoded data doesn't get into the database it causes someone real pain. cheers andrew
Re: [HACKERS] JSON for PG 9.2
On Sun, Jan 15, 2012 at 8:08 AM, Andrew Dunstan and...@dunslane.net wrote: On 01/14/2012 03:06 PM, Andrew Dunstan wrote: OK, here's a patch that does both query_to_json and array_to_json, along with docs and regression tests. It includes Robert's original patch, although I can produce a differential patch if required. It can also be pulled from https://bitbucket.org/adunstan/pgdevel Here's an update that adds row_to_json, plus a bit more cleanup. This is bit-rotted such that initdb fails creating template1 database in /tmp/bar/src/test/regress/./tmp_check/data/base/1 ... FATAL: could not create unique index pg_proc_oid_index DETAIL: Key (oid)=(3145) is duplicated. I bumped up those oids in the patch, and it passes make check once I figure out how to get the test run under UTF-8. Is it supposed to pass under other encodings? I can't tell from the rest of the thread whether it is supposed to pass in other encodings or not. Cheers, Jeff
Re: [HACKERS] JSON for PG 9.2
On Fri, Jan 20, 2012 at 12:07 AM, Tom Lane t...@sss.pgh.pa.us wrote: Andrew Dunstan and...@dunslane.net writes: On 01/19/2012 04:12 PM, Robert Haas wrote: On Thu, Jan 19, 2012 at 4:07 PM, Andrew Dunstan and...@dunslane.net wrote: The spec only allows unescaped Unicode chars (and for our purposes that means UTF8). An unescaped non-ASCII character in, say, ISO-8859-1 will result in something that's not legal JSON. I understand. I'm proposing that we not care. In other words, if the server encoding is UTF-8, it'll really be JSON. But if the server encoding is something else, it'll be almost-JSON. Of course, for data going to the client, if the client encoding is UTF8, they should get legal JSON, regardless of what the database encoding is, and conversely too, no? Yes. I think this argument has been mostly theologizing, along the lines of how many JSON characters can dance on the head of a pin. From a user's perspective, the database encoding is only a constraint on which characters he can store. Bingo. He does not know or care what the bit representation is inside the server. As such, if we store a non-ASCII character in a JSON string, it's valid JSON as far as the user is concerned, so long as that character exists in the Unicode standard. If his client encoding is UTF8, the value will be letter-perfect JSON when it gets to him; and if his client encoding is not UTF8, then he's already pretty much decided that he doesn't give a fig about the Unicode-centricity of the JSON spec, no? Also agreed. Personally, I think it may not have been a great idea to tie the JSON spec so closely to Unicode, but I understand that it would have been difficult to define an encoding-agnostic equivalent of \uXXXX, since it's hard to know for sure whether an arbitrary encoding even has a (sensible?) definition of code points, and they probably wanted to avoid ambiguity. But, it's bound to cause problems for any system that runs in some other encoding, which, when so requested, we do. Even if we had the ability to support multiple encodings in the same database, I'm not sure I'd be very excited about insisting that JSON data always be stored in UTF-8, because that would introduce a lot of unnecessary transcoding for people using other encodings and basically unnecessarily handicap the functionality provided by the datatype. But at least if we had that, people would have the *option* to use JSON with UTF-8 and get the fully spec-compliant behavior. As it is, they don't; the system we have forces the database encoding on all datatypes whether they like it or not, and that ain't changing for 9.2. So I'm with Robert: we should just plain not care. I would further suggest that maybe what we should do with incoming JSON escape sequences is convert them to Unicode code points and then to the equivalent character in the database encoding (or throw error if there is none). The code I've written so far does no canonicalization of the input value of any kind, just as we do for XML. I'm inclined to leave it that way. Eventually, we might want to make the JSON datatype support equality comparisons and so on, and that will require the system to know that the letter r can be encoded as some \uXXXX sequence and that the escape \r is equivalent to some other escape \uXXXX, but right now all the code does is try to validate that the JSON is legal, NOT second-guess the user's choice about how to spell things or where to insert whitespace.
I think that's a good choice because (1) AFAIK, there's no official canonicalization method for JSON, so whatever we pick will be something we think is best, not an official method sanctioned by the spec, (2) users might prefer the way they chose to represent a given value over the way we choose to represent it, and (3) by simply validating and storing the JSON object, rather than doing any canonicalization, the input function avoids the need to do any data copying, hopefully maximizing speed. Canonicalization can be added on top of what I've done here and people who want or need it can use it; I have some ideas around how to make that leverage the existing code that I intend to pursue for 9.3, but right now I'd rather not go there. So, given that framework, what the patch does is this: if you're using UTF-8, then \uXXXX is accepted, provided that XXXX is something that equates to a legal Unicode code point. It isn't converted to the corresponding character: it's just validated. If you're NOT using UTF-8, then it allows \uXXXX for code points up through 127 (which we assume are the same in all encodings) and anything higher than that is rejected. If someone knows an easy way to check whether a \uXXXX sequence for XXXX > 007F is a legal Unicode code point that has an equivalent in the current server encoding, then we can add logic to allow that case also, but personally I'm not that excited about it. Anyone who is using \uXXXX escapes with a non-Unicode coding
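Two psql probes that make the validate-but-don't-canonicalize behavior concrete (the second compares the stored text of two spellings of the same JSON value):

    SELECT '{ "a" : 1 }'::json;                         -- returned exactly as typed
    SELECT '"\u0041"'::json::text = '"A"'::json::text;  -- false: the escape is not resolved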
Re: [HACKERS] JSON for PG 9.2
Robert Haas robertmh...@gmail.com writes: The code I've written so far does no canonicalization of the input value of any kind, just as we do for XML. Fair enough. So, given that framework, what the patch does is this: if you're using UTF-8, then \uXXXX is accepted, provided that XXXX is something that equates to a legal Unicode code point. It isn't converted to the corresponding character: it's just validated. If you're NOT using UTF-8, then it allows \uXXXX for code points up through 127 (which we assume are the same in all encodings) and anything higher than that is rejected. This seems a bit silly. If you're going to leave the escape sequence as ASCII, then why not just validate that it names a legal Unicode code point and be done? There is no reason whatever that that behavior needs to depend on the database encoding. regards, tom lane
Re: [HACKERS] JSON for PG 9.2
On Fri, Jan 20, 2012 at 10:27 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: The code I've written so far does no canonicalization of the input value of any kind, just as we do for XML. Fair enough. So, given that framework, what the patch does is this: if you're using UTF-8, then \uXXXX is accepted, provided that XXXX is something that equates to a legal Unicode code point. It isn't converted to the corresponding character: it's just validated. If you're NOT using UTF-8, then it allows \uXXXX for code points up through 127 (which we assume are the same in all encodings) and anything higher than that is rejected. This seems a bit silly. If you're going to leave the escape sequence as ASCII, then why not just validate that it names a legal Unicode code point and be done? There is no reason whatever that that behavior needs to depend on the database encoding. Mostly because that would prevent us from adding canonicalization in the future, AFAICS, and I don't want to back myself into a corner. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] JSON for PG 9.2
On 01/20/2012 09:19 AM, Robert Haas wrote: On Fri, Jan 20, 2012 at 12:07 AM, Tom Lane t...@sss.pgh.pa.us wrote: Andrew Dunstan and...@dunslane.net writes: On 01/19/2012 04:12 PM, Robert Haas wrote: On Thu, Jan 19, 2012 at 4:07 PM, Andrew Dunstan and...@dunslane.net wrote: The spec only allows unescaped Unicode chars (and for our purposes that means UTF8). An unescaped non-ASCII character in, say, ISO-8859-1 will result in something that's not legal JSON. I understand. I'm proposing that we not care. In other words, if the server encoding is UTF-8, it'll really be JSON. But if the server encoding is something else, it'll be almost-JSON. Of course, for data going to the client, if the client encoding is UTF8, they should get legal JSON, regardless of what the database encoding is, and conversely too, no? Yes. I think this argument has been mostly theologizing, along the lines of how many JSON characters can dance on the head of a pin. From a user's perspective, the database encoding is only a constraint on which characters he can store. Bingo. He does not know or care what the bit representation is inside the server. As such, if we store a non-ASCII character in a JSON string, it's valid JSON as far as the user is concerned, so long as that character exists in the Unicode standard. If his client encoding is UTF8, the value will be letter-perfect JSON when it gets to him; and if his client encoding is not UTF8, then he's already pretty much decided that he doesn't give a fig about the Unicode-centricity of the JSON spec, no? Also agreed. Personally, I think it may not have been a great idea to tie the JSON spec so closely to Unicode, but I understand that it would have been difficult to define an encoding-agnostic equivalent of \uXXXX, since it's hard to know for sure whether an arbitrary encoding even has a (sensible?) definition of code points, and they probably wanted to avoid ambiguity. But, it's bound to cause problems for any system that runs in some other encoding, which, when so requested, we do. Even if we had the ability to support multiple encodings in the same database, I'm not sure I'd be very excited about insisting that JSON data always be stored in UTF-8, because that would introduce a lot of unnecessary transcoding for people using other encodings and basically unnecessarily handicap the functionality provided by the datatype. But at least if we had that, people would have the *option* to use JSON with UTF-8 and get the fully spec-compliant behavior. As it is, they don't; the system we have forces the database encoding on all datatypes whether they like it or not, and that ain't changing for 9.2. So I'm with Robert: we should just plain not care. I would further suggest that maybe what we should do with incoming JSON escape sequences is convert them to Unicode code points and then to the equivalent character in the database encoding (or throw error if there is none). The code I've written so far does no canonicalization of the input value of any kind, just as we do for XML. I'm inclined to leave it that way. Eventually, we might want to make the JSON datatype support equality comparisons and so on, and that will require the system to know that the letter r can be encoded as some \uXXXX sequence and that the escape \r is equivalent to some other escape \uXXXX, but right now all the code does is try to validate that the JSON is legal, NOT second-guess the user's choice about how to spell things or where to insert whitespace.
I think that's a good choice because (1) AFAIK, there's no official canonicalization method for JSON, so whatever we pick will be something we think is best, not an official method sanctioned by the spec, (2) users might prefer the way they chose to represent a given value over the way we choose to represent it, and (3) by simply validating and storing the JSON object, rather than doing any canonicalization, the input function avoids the need to do any data copying, hopefully maximizing speed. Canonicalization can be added on top of what I've done here and people who want or need it can use it; I have some ideas around how to make that leverage the existing code that I intend to pursue for 9.3, but right now I'd rather not go there. So, given that framework, what the patch does is this: if you're using UTF-8, then \uXXXX is accepted, provided that XXXX is something that equates to a legal Unicode code point. It isn't converted to the corresponding character: it's just validated. If you're NOT using UTF-8, then it allows \uXXXX for code points up through 127 (which we assume are the same in all encodings) and anything higher than that is rejected. If someone knows an easy way to check whether a \uXXXX sequence for XXXX > 007F is a legal Unicode code point that has an equivalent in the current server encoding, then we can add logic to allow that case also, but personally I'm not that excited about it. Anyone who is using \uXXXX escapes
Re: [HACKERS] JSON for PG 9.2
On Fri, Jan 20, 2012 at 10:45 AM, Andrew Dunstan and...@dunslane.net wrote: XML's &#nnnn; escape mechanism is more or less the equivalent of JSON's \uXXXX. But XML documents can be encoded in a variety of encodings, including non-unicode encodings such as Latin-1. However, no matter what the document encoding, &#nnnn; designates the character with Unicode code point nnnn, whether or not that is part of the document encoding's charset. OK. Given that precedent, I'm wondering if we do need to enforce anything other than that it is a valid unicode code point. Equivalence comparison is going to be difficult anyway if you're not resolving all \uXXXX escapes. Possibly we need some sort of canonicalization function to apply for comparison purposes. But we're not providing any comparison ops today anyway, so I don't think we need to make that decision now. As you say, there doesn't seem to be any defined canonical form - the spec is a bit light on in this respect. Well, we clearly have to resolve all \uXXXX escapes to do either comparison or canonicalization. The current patch does neither, but presumably we want to leave the door open to such things. If we're using UTF-8 and comparing two strings, and we get to a position where one of them has a character and the other has \uXXXX, it's pretty simple to do the comparison: we just turn XXXX into a wchar_t and test for equality. That should be trivial, unless I'm misunderstanding. If, however, we're not using UTF-8, we have to first turn \uXXXX into a Unicode code point, then convert that to a character in the database encoding, and then test for equality with the other character after that. I'm not sure whether that's possible in general, how to do it, or how efficient it is. Can you or anyone shed any light on that topic? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On Jan 19, 2012, at 9:07 PM, Tom Lane wrote: If his client encoding is UTF8, the value will be letter-perfect JSON when it gets to him; and if his client encoding is not UTF8, then he's already pretty much decided that he doesn't give a fig about the Unicode-centricity of the JSON spec, no? Don’t entirely agree with this. Some folks are stuck with other encodings and cannot change them for one reason or another. That said, they can convert JSON from their required encoding into UTF-8 on the client side, so there is a workaround. Not that this changes anything, and I agree with the overall direction of the discussion here. I just want to make sure we keep in mind folks who don’t necessarily have the freedom to switch to UTF-8. (And I say this as someone who *always* uses UTF-8!) David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On Jan 20, 2012, at 8:58 AM, Robert Haas wrote: If, however, we're not using UTF-8, we have to first turn \uXXXX into a Unicode code point, then convert that to a character in the database encoding, and then test for equality with the other character after that. I'm not sure whether that's possible in general, how to do it, or how efficient it is. Can you or anyone shed any light on that topic? If it’s like the XML example, it should always represent a Unicode code point, and *not* be converted to the other character set, no? At any rate, since the JSON standard requires UTF-8, such distinctions having to do with alternate encodings are not likely to be covered, so I suspect we can do whatever we want here. It’s outside the spec. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On Fri, Jan 20, 2012 at 09:12:13AM -0800, David E. Wheeler wrote: On Jan 19, 2012, at 9:07 PM, Tom Lane wrote: If his client encoding is UTF8, the value will be letter-perfect JSON when it gets to him; and if his client encoding is not UTF8, then he's already pretty much decided that he doesn't give a fig about the Unicode-centricity of the JSON spec, no? Don’t entirely agree with this. Some folks are stuck with other encodings and cannot change them for one reason or another. That said, they can convert JSON from their required encoding into UTF-8 on the client side, so there is a workaround. Perhaps in addition to trying to just 'do the right thing by default', it makes sense to have two canonicalization functions? Say: json_utf8() and json_ascii(). They could give the same output no matter what encoding was set? json_utf8 would give nice output where characters were canonicalized to native utf8 characters and json_ascii() would output only non-control ascii characters literally and escape everything else or something like that? Garick -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
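If two such functions existed (they are purely hypothetical here; neither is in the patch), the contrast might look like:

    SELECT json_ascii('{"name":"café"}');      -- {"name":"caf\u00e9"}, 7-bit safe
    SELECT json_utf8('{"name":"caf\u00e9"}');  -- {"name":"café"}, native UTF-8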
Re: [HACKERS] JSON for PG 9.2
On 01/20/2012 11:58 AM, Robert Haas wrote: On Fri, Jan 20, 2012 at 10:45 AM, Andrew Dunstan and...@dunslane.net wrote: XML's &#nnnn; escape mechanism is more or less the equivalent of JSON's \uXXXX. But XML documents can be encoded in a variety of encodings, including non-unicode encodings such as Latin-1. However, no matter what the document encoding, &#nnnn; designates the character with Unicode code point nnnn, whether or not that is part of the document encoding's charset. OK. Given that precedent, I'm wondering if we do need to enforce anything other than that it is a valid unicode code point. Equivalence comparison is going to be difficult anyway if you're not resolving all \uXXXX escapes. Possibly we need some sort of canonicalization function to apply for comparison purposes. But we're not providing any comparison ops today anyway, so I don't think we need to make that decision now. As you say, there doesn't seem to be any defined canonical form - the spec is a bit light on in this respect. Well, we clearly have to resolve all \uXXXX escapes to do either comparison or canonicalization. The current patch does neither, but presumably we want to leave the door open to such things. If we're using UTF-8 and comparing two strings, and we get to a position where one of them has a character and the other has \uXXXX, it's pretty simple to do the comparison: we just turn XXXX into a wchar_t and test for equality. That should be trivial, unless I'm misunderstanding. If, however, we're not using UTF-8, we have to first turn \uXXXX into a Unicode code point, then convert that to a character in the database encoding, and then test for equality with the other character after that. I'm not sure whether that's possible in general, how to do it, or how efficient it is. Can you or anyone shed any light on that topic? We know perfectly well how to turn two strings from encoding x to utf8 (see mb_utils.c::pg_do_encoding_conversion() ). Once we've done that ISTM we have reduced this to the previous problem, as the mathematicians like to say. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
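The reduction can be sketched at the SQL level with the existing conversion functions (just an illustration of the idea, on a UTF-8 database, not how a json comparison would actually be implemented):

    -- the same character arriving as LATIN1 bytes and as UTF-8 bytes
    -- compares equal once both sides are transcoded into one encoding:
    SELECT convert_from('\xe9'::bytea, 'LATIN1')      -- é, from LATIN1
         = convert_from('\xc3a9'::bytea, 'UTF8');     -- é, from UTF-8
    -- true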
Re: [HACKERS] JSON for PG 9.2
On Fri, Jan 20, 2012 at 12:14 PM, David E. Wheeler da...@kineticode.com wrote: On Jan 20, 2012, at 8:58 AM, Robert Haas wrote: If, however, we're not using UTF-8, we have to first turn \uXXXX into a Unicode code point, then convert that to a character in the database encoding, and then test for equality with the other character after that. I'm not sure whether that's possible in general, how to do it, or how efficient it is. Can you or anyone shed any light on that topic? If it’s like the XML example, it should always represent a Unicode code point, and *not* be converted to the other character set, no? Well, you can pick which way you want to do the conversion. If the database encoding is SJIS, and there's an SJIS character in a string that gets passed to json_in(), and there's another string which also gets passed to json_in() which contains \uXXXX, then any sort of canonicalization or equality testing is going to need to convert the SJIS character to a Unicode code point, or the Unicode code point to an SJIS character, to see whether they match. Err, actually, now that I think about it, that might be a problem: what happens if we're trying to test two characters for equality and the encoding conversion fails? We really just want to return false - the strings are clearly not equal if either contains even one character that can't be converted to the other encoding - so it's not good if an error gets thrown in there anywhere. At any rate, since the JSON standard requires UTF-8, such distinctions having to do with alternate encodings are not likely to be covered, so I suspect we can do whatever we want here. It’s outside the spec. I agree. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
Robert Haas robertmh...@gmail.com writes: Err, actually, now that I think about it, that might be a problem: what happens if we're trying to test two characters for equality and the encoding conversion fails? This is surely all entirely doable given the encoding infrastructure we already have. We might need some minor refactoring, eg to have a way of not throwing an error, but it's not going to be that hard to achieve if somebody wants to do it. So I still see little reason for making the JSON type behave visibly differently in non-UTF8 database encodings. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On Fri, Jan 20, 2012 at 2:21 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Err, actually, now that I think about it, that might be a problem: what happens if we're trying to test two characters for equality and the encoding conversion fails? This is surely all entirely doable given the encoding infrastructure we already have. We might need some minor refactoring, eg to have a way of not throwing an error, but it's not going to be that hard to achieve if somebody wants to do it. So I still see little reason for making the JSON type behave visibly differently in non-UTF8 database encodings. OK. It feels a little grotty to me, but I'll go with the flow. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On Sat, Jan 14, 2012 at 3:06 PM, Andrew Dunstan and...@dunslane.net wrote: Second, what should we do when the database encoding isn't UTF8? I'm inclined to emit a \uXXXX escape for any non-ASCII character (assuming it has a unicode code point - are there any code points in the non-unicode encodings that don't have unicode equivalents?). The alternative would be to fail on non-ASCII characters, which might be ugly. Of course, anyone wanting to deal with JSON should be using UTF8 anyway, but we still have to deal with these things. What about SQL_ASCII? If there's a non-ASCII sequence there we really have no way of telling what it should be. There at least I think we should probably error out. I don't see any reason to escape anything more than the minimum required by the spec, which only requires it for control characters. If somebody's got a non-ASCII character in there, we can simply allow it to be represented by itself. That's almost certainly more compact (and very possibly more readable) than emitting \uXXXX for each such instance, and it also matches what the current EXPLAIN (FORMAT JSON) output does. In other words, let's decree that when the database encoding isn't UTF-8, *escaping* of non-ASCII characters doesn't work. But *unescaped* non-ASCII characters should still work just fine. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
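Under that rule the output functions pass high-bit characters through untouched, e.g. (a sketch assuming the array_to_json from Andrew's patch, with the minimal escaping described above):

    SELECT array_to_json(ARRAY['é']);
    -- ["é"]   rather than   ["\u00e9"]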
Re: [HACKERS] JSON for PG 9.2
On 01/19/2012 03:49 PM, Robert Haas wrote: In other words, let's decree that when the database encoding isn't UTF-8, *escaping* of non-ASCII characters doesn't work. But *unescaped* non-ASCII characters should still work just fine. The spec only allows unescaped Unicode chars (and for our purposes that means UTF8). An unescaped non-ASCII character in, say, ISO-8859-1 will result in something that's not legal JSON. See http://www.ietf.org/rfc/rfc4627.txt?number=4627 section 3. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On Thu, Jan 19, 2012 at 4:07 PM, Andrew Dunstan and...@dunslane.net wrote: On 01/19/2012 03:49 PM, Robert Haas wrote: In other words, let's decree that when the database encoding isn't UTF-8, *escaping* of non-ASCII characters doesn't work. But *unescaped* non-ASCII characters should still work just fine. The spec only allows unescaped Unicode chars (and for our purposes that means UTF8). An unescaped non-ASCII character in, say, ISO-8859-1 will result in something that's not legal JSON. See http://www.ietf.org/rfc/rfc4627.txt?number=4627 section 3. I understand. I'm proposing that we not care. In other words, if the server encoding is UTF-8, it'll really be JSON. But if the server encoding is something else, it'll be almost-JSON. And specifically, the \uXXXX syntax won't work, and there might be some non-Unicode characters in there. If that's not the behavior you want, then use UTF-8. It seems pretty clear that we're going to have to make some trade-off to handle non-UTF8 encodings, and I think what I'm suggesting is a lot less painful than disabling high-bit characters altogether. If we do that, then what happens if a user runs EXPLAIN (FORMAT JSON) and his column label has a non-Unicode character in there? Should we say, oh, sorry, you can't explain that in JSON format? That is mighty unfriendly, and probably mighty complicated and expensive to figure out, too. We *do not support* mixing encodings in the same database, and if we make it the job of this patch to fix that problem, we're going to be in the same place for 9.2 that we have been for the last several releases: nowhere. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
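The EXPLAIN case in question is simply something like this (illustrative; the "café" column name is a made-up example and the plan output is abbreviated):

    CREATE TABLE t ("café" int);
    EXPLAIN (FORMAT JSON, VERBOSE) SELECT "café" FROM t;
    -- the column name appears in the "Output" list of the JSON plan in
    -- whatever the server encoding is; refusing to run this in a LATIN1
    -- database is the unfriendly alternative being argued against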
Re: [HACKERS] JSON for PG 9.2
On 01/19/2012 04:12 PM, Robert Haas wrote: On Thu, Jan 19, 2012 at 4:07 PM, Andrew Dunstan and...@dunslane.net wrote: On 01/19/2012 03:49 PM, Robert Haas wrote: In other words, let's decree that when the database encoding isn't UTF-8, *escaping* of non-ASCII characters doesn't work. But *unescaped* non-ASCII characters should still work just fine. The spec only allows unescaped Unicode chars (and for our purposes that means UTF8). An unescaped non-ASCII character in, say, ISO-8859-1 will result in something that's not legal JSON. See http://www.ietf.org/rfc/rfc4627.txt?number=4627 section 3. I understand. I'm proposing that we not care. In other words, if the server encoding is UTF-8, it'll really be JSON. But if the server encoding is something else, it'll be almost-JSON. And specifically, the \uXXXX syntax won't work, and there might be some non-Unicode characters in there. If that's not the behavior you want, then use UTF-8. It seems pretty clear that we're going to have to make some trade-off to handle non-UTF8 encodings, and I think what I'm suggesting is a lot less painful than disabling high-bit characters altogether. If we do that, then what happens if a user runs EXPLAIN (FORMAT JSON) and his column label has a non-Unicode character in there? Should we say, oh, sorry, you can't explain that in JSON format? That is mighty unfriendly, and probably mighty complicated and expensive to figure out, too. We *do not support* mixing encodings in the same database, and if we make it the job of this patch to fix that problem, we're going to be in the same place for 9.2 that we have been for the last several releases: nowhere. OK, then we need to say that very clearly and up front (including in the EXPLAIN docs.) Of course, for data going to the client, if the client encoding is UTF8, they should get legal JSON, regardless of what the database encoding is, and conversely too, no? cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On Thu, Jan 19, 2012 at 5:59 PM, Andrew Dunstan and...@dunslane.net wrote: OK, then we need to say that very clearly and up front (including in the EXPLAIN docs.) Can do. Of course, for data going to the client, if the client encoding is UTF8, they should get legal JSON, regardless of what the database encoding is, and conversely too, no? Well, that would be nice, but I don't think it's practical. It will certainly be the case, under the scheme I'm proposing, or probably any other sensible scheme also, that if a client whose encoding is UTF-8 gets a value of type json back from the database, it's strictly valid JSON. But it won't be possible to store every legal JSON value in the database if the database encoding is anything other than UTF-8, even if the client encoding is UTF-8. The backend will get the client's UTF-8 bytes and transcode them to the server encoding before calling the type-input function, so if there are characters in there that can't be represented in the server encoding, then we'll error out before the JSON data type ever gets control. In theory, it would be possible to accept such strings if the client chooses to represent them using a \uXXXX sequence, but I'm unexcited about doing the work required to make that happen, because it will still be pretty half-baked: we'll be able to accept some representations of the same JSON constant but not others. I think the real fix for this problem is to introduce an infrastructure inside the database that allows us to have different columns stored in different encodings. People use bytea for that right now, but that's pretty unfriendly: it would be nice to have a better system. However, I expect that to take a lot of work and break a lot of things, and until we do it I don't feel that compelled to provide buggy and incomplete support for it under the guise of implementing a JSON datatype. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
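A concrete instance of the failure described (a sketch; the exact error text will vary with the encodings involved):

    -- server encoding LATIN1, client encoding UTF8:
    SELECT '"日本語"'::json;
    -- ERROR:  character ... has no equivalent in encoding "LATIN1"
    -- the conversion fails before json_in() ever sees the value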
Re: [HACKERS] JSON for PG 9.2
On Jan 19, 2012, at 4:27 PM, Robert Haas wrote: I think the real fix for this problem is to introduce an infrastructure inside the database that allows us to have different columns stored in different encodings. People use bytea for that right now, but that's pretty unfriendly: it would be nice to have a better system. However, I expect that to take a lot of work and break a lot of things, and until we do it I don't feel that compelled to provide buggy and incomplete support for it under the guise of implementing a JSON datatype. +1 This seems like a reasonable compromise and course of action, especially if someone is interested in taking on column-level encodings at some point in the next year or two. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
Andrew Dunstan and...@dunslane.net writes: On 01/19/2012 04:12 PM, Robert Haas wrote: On Thu, Jan 19, 2012 at 4:07 PM, Andrew Dunstanand...@dunslane.net wrote: The spec only allows unescaped Unicode chars (and for our purposes that means UTF8). An unescaped non-ASCII character in, say, ISO-8859-1 will result in something that's not legal JSON. I understand. I'm proposing that we not care. In other words, if the server encoding is UTF-8, it'll really be JSON. But if the server encoding is something else, it'll be almost-JSON. Of course, for data going to the client, if the client encoding is UTF8, they should get legal JSON, regardless of what the database encoding is, and conversely too, no? Yes. I think this argument has been mostly theologizing, along the lines of how many JSON characters can dance on the head of a pin. From a user's perspective, the database encoding is only a constraint on which characters he can store. He does not know or care what the bit representation is inside the server. As such, if we store a non-ASCII character in a JSON string, it's valid JSON as far as the user is concerned, so long as that character exists in the Unicode standard. If his client encoding is UTF8, the value will be letter-perfect JSON when it gets to him; and if his client encoding is not UTF8, then he's already pretty much decided that he doesn't give a fig about the Unicode-centricity of the JSON spec, no? So I'm with Robert: we should just plain not care. I would further suggest that maybe what we should do with incoming JSON escape sequences is convert them to Unicode code points and then to the equivalent character in the database encoding (or throw error if there is none). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On 01/14/2012 03:06 PM, Andrew Dunstan wrote: OK, here's a patch that does both query_to_json and array_to_json, along with docs and regression tests. It includes Robert's original patch, although I can produce a differential patch if required. It can also be pulled from https://bitbucket.org/adunstan/pgdevel Here's an update that adds row_to_json, plus a bit more cleanup. Example:

andrew=# SELECT row_to_json(q) FROM (SELECT $$a$$ || x AS b, y AS c, ARRAY[ROW(x.*,ARRAY[1,2,3]), ROW(y.*,ARRAY[4,5,6])] AS z FROM generate_series(1,2) x, generate_series(4,5) y) q;
                            row_to_json
--------------------------------------------------------------------
 {"b":"a1","c":4,"z":[{"f1":1,"f2":[1,2,3]},{"f1":4,"f2":[4,5,6]}]}
 {"b":"a1","c":5,"z":[{"f1":1,"f2":[1,2,3]},{"f1":5,"f2":[4,5,6]}]}
 {"b":"a2","c":4,"z":[{"f1":2,"f2":[1,2,3]},{"f1":4,"f2":[4,5,6]}]}
 {"b":"a2","c":5,"z":[{"f1":2,"f2":[1,2,3]},{"f1":5,"f2":[4,5,6]}]}
(4 rows)

(This might be more to Robert's taste than query_to_json() :-) ) cheers andrew

diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 152ef2f..f45b10b 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -269,6 +269,12 @@
        <entry></entry>
        <entry>XML data</entry>
       </row>
+
+      <row>
+       <entry><type>json</type></entry>
+       <entry></entry>
+       <entry>JSON data</entry>
+      </row>
      </tbody>
     </tgroup>
    </table>
@@ -4169,6 +4175,21 @@ SET xmloption TO { DOCUMENT | CONTENT };
  </sect2>
  </sect1>

+ <sect1 id="datatype-json">
+  <title><acronym>JSON</> Type</title>
+
+  <indexterm zone="datatype-json">
+   <primary>JSON</primary>
+  </indexterm>
+
+  <para>
+   The <type>json</type> data type can be used to store JSON data. Such
+   data can also be stored as <type>text</type>, but the
+   <type>json</type> data type has the advantage of checking that each
+   stored value is a valid JSON value.
+  </para>
+ </sect1>
+
 &array;
 &rowtypes;
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 2e06346..9368739 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -9615,6 +9615,77 @@ table2-mapping
  </sect2>
  </sect1>

+ <sect1 id="functions-json">
+  <title>JSON functions</title>
+
+  <indexterm zone="datatype-json">
+   <primary>JSON</primary>
+   <secondary>Functions and operators</secondary>
+  </indexterm>
+
+  <para>
+   This section describes the functions that are available for creating
+   JSON (see <xref linkend="datatype-json">) data.
+  </para>
+
+  <table id="functions-json-table">
+   <title>JSON Support Functions</title>
+   <tgroup cols="4">
+    <thead>
+     <row>
+      <entry>Function</entry>
+      <entry>Description</entry>
+      <entry>Example</entry>
+      <entry>Example Result</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry>
+       <indexterm>
+        <primary>query_to_json</primary>
+       </indexterm>
+       <literal>query_to_json(text, boolean)</literal>
+      </entry>
+      <entry>
+       Returns the result of running the query as JSON. If the
+       second parameter is true, there will be a line feed between records.
+      </entry>
+      <entry><literal>query_to_json('select 1 as a, $$foo$$ as b', false)</literal></entry>
+      <entry><literal>[{"a":1,"b":"foo"}]</literal></entry>
+     </row>
+     <row>
+      <entry>
+       <indexterm>
+        <primary>array_to_json</primary>
+       </indexterm>
+       <literal>array_to_json(anyarray)</literal>
+      </entry>
+      <entry>
+       Returns the array as JSON. A Postgres multi-dimensional array becomes a JSON
+       array of arrays.
+      </entry>
+      <entry><literal>array_to_json('{{1,5},{99,100}}'::int[])</literal></entry>
+      <entry><literal>[[1,5],[99,100]]</literal></entry>
+     </row>
+     <row>
+      <entry>
+       <indexterm>
+        <primary>row_to_json</primary>
+       </indexterm>
+       <literal>row_to_json(record)</literal>
+      </entry>
+      <entry>
+       Returns the row as JSON.
+      </entry>
+      <entry><literal>row_to_json(row(1,'foo'))</literal></entry>
+      <entry><literal>{"f1":1,"f2":"foo"}</literal></entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+
+ </sect1>

 <sect1 id="functions-sequence">
  <title>Sequence Manipulation Functions</title>
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 8b48105..ddb2784 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -24,6 +24,7 @@
 #include "rewrite/rewriteHandler.h"
 #include "tcop/tcopprot.h"
 #include "utils/builtins.h"
+#include "utils/json.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
 #include "utils/snapmgr.h"
@@ -99,7 +100,6 @@ static void ExplainDummyGroup(const char *objtype, const char *labelname,
 static void ExplainXMLTag(const char *tagname, int flags, ExplainState *es);
 static void ExplainJSONLineEnding(ExplainState *es);
 static void ExplainYAMLLineStarting(ExplainState *es);
-static void escape_json(StringInfo buf, const char *str);
 static void escape_yaml(StringInfo buf, const char
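For reference, the simplest way to exercise the new function is against an ad-hoc subselect (a sketch assuming the patch above; the output format follows the examples it gives):

    SELECT row_to_json(t) FROM (SELECT 42 AS id, 'foo' AS name) t;
    -- {"id":42,"name":"foo"}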
Re: [HACKERS] JSON for PG 9.2
2012/1/14 Andrew Dunstan and...@dunslane.net: On 01/12/2012 10:51 AM, Andrew Dunstan wrote: On 01/12/2012 10:44 AM, Pavel Stehule wrote: 2012/1/12 Andrew Dunstan and...@dunslane.net: On 01/12/2012 09:00 AM, Joey Adams wrote: I wrote an array_to_json function during GSoC 2010: http://git.postgresql.org/gitweb/?p=json-datatype.git;a=blob;f=json_io.c#l289 It's not exposed as a procedure called array_to_json: it's part of the to_json function, which decides what to do based on the argument type. Excellent, this is just the point at which I stopped work last night, so with your permission I'll steal this and it will save me a good chunk of time. this should be little bit more enhanced to support a row arrays - it can be merged with some routines from pst tool http://okbob.blogspot.com/2010/11/new-version-of-pst-collection-is.html I will be covering composites. OK, here's a patch that does both query_to_json and array_to_json, along with docs and regression tests. It includes Robert's original patch, although I can produce a differential patch if required. It can also be pulled from https://bitbucket.org/adunstan/pgdevel A couple of things to note. First, the problem about us losing column names that I noted a couple of months ago and Tom did a bit of work on is exercised by this. We really need to fix it. Example: support for SELECT ROW (x AS "real name", y AS "real name") is a good idea and should be used more times than only here. Regards Pavel andrew=# select array_to_json(array_agg(row(z.*))) from (select $$a$$ || x as b, y as c, array[row(x.*,array[1,2,3]), row(y.*,array[4,5,6])] as z from generate_series(1,1) x, generate_series(4,4) y) z; array_to_json - [{"f1":"a1","f2":4,"f3":[{"f1":1,"f2":[1,2,3]},{"f1":4,"f2":[4,5,6]}]}] (1 row) Here we've lost b, c and z as column names. Second, what should we do when the database encoding isn't UTF8? I'm inclined to emit a \uXXXX escape for any non-ASCII character (assuming it has a unicode code point - are there any code points in the non-unicode encodings that don't have unicode equivalents?). The alternative would be to fail on non-ASCII characters, which might be ugly. Of course, anyone wanting to deal with JSON should be using UTF8 anyway, but we still have to deal with these things. What about SQL_ASCII? If there's a non-ASCII sequence there we really have no way of telling what it should be. There at least I think we should probably error out. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On Sat, Jan 14, 2012 at 3:06 PM, Andrew Dunstan and...@dunslane.net wrote: Second, what should we do when the database encoding isn't UTF8? I'm inclined to emit a \uXXXX escape for any non-ASCII character (assuming it has a unicode code point - are there any code points in the non-unicode encodings that don't have unicode equivalents?). The alternative would be to fail on non-ASCII characters, which might be ugly. Of course, anyone wanting to deal with JSON should be using UTF8 anyway, but we still have to deal with these things. What about SQL_ASCII? If there's a non-ASCII sequence there we really have no way of telling what it should be. There at least I think we should probably error out. I don't think there is a satisfying solution to this problem. Things working against us: * Some server encodings support characters that don't map to Unicode characters (e.g. unused slots in Windows-1252). Thus, converting to UTF-8 and back is lossy in general. * We want a normalized representation for comparison. This will involve a mixture of server and Unicode characters, unless the encoding is UTF-8. * We can't efficiently convert individual characters to and from Unicode with the current API. * What do we do about \u0000 ? TEXT datums cannot contain NUL characters. I'd say just ban Unicode escapes and non-ASCII characters unless the server encoding is UTF-8, and ban all \u0000 escapes. It's easy, and whatever we support later will be a superset of this. Strategies for handling this situation have been discussed in prior emails. This is where things got stuck last time. - Joey -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
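The \u0000 problem in particular is easy to see (sketch):

    SELECT '"\u0000"'::json;
    -- legal JSON by the RFC, but the de-escaped value contains a NUL byte,
    -- which a text datum can never hold, so any canonicalizing or
    -- de-escaping implementation has to reject it or treat it specially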
Re: [HACKERS] JSON for PG 9.2
On 01/14/2012 06:11 PM, Joey Adams wrote: On Sat, Jan 14, 2012 at 3:06 PM, Andrew Dunstan and...@dunslane.net wrote: Second, what should we do when the database encoding isn't UTF8? I'm inclined to emit a \uXXXX escape for any non-ASCII character (assuming it has a unicode code point - are there any code points in the non-unicode encodings that don't have unicode equivalents?). The alternative would be to fail on non-ASCII characters, which might be ugly. Of course, anyone wanting to deal with JSON should be using UTF8 anyway, but we still have to deal with these things. What about SQL_ASCII? If there's a non-ASCII sequence there we really have no way of telling what it should be. There at least I think we should probably error out. I don't think there is a satisfying solution to this problem. Things working against us: * Some server encodings support characters that don't map to Unicode characters (e.g. unused slots in Windows-1252). Thus, converting to UTF-8 and back is lossy in general. * We want a normalized representation for comparison. This will involve a mixture of server and Unicode characters, unless the encoding is UTF-8. * We can't efficiently convert individual characters to and from Unicode with the current API. * What do we do about \u0000 ? TEXT datums cannot contain NUL characters. I'd say just ban Unicode escapes and non-ASCII characters unless the server encoding is UTF-8, and ban all \u0000 escapes. It's easy, and whatever we support later will be a superset of this. Strategies for handling this situation have been discussed in prior emails. This is where things got stuck last time. Well, from where I'm coming from, nuls are not a problem. But escape_json() is currently totally encoding-unaware. It produces \uXXXX escapes for low ascii characters, and just passes through characters with the high bit set. That's possibly OK for EXPLAIN output - we really don't want EXPLAIN failing. But maybe we should ban JSON output for EXPLAIN if the encoding isn't UTF8. Another question in my mind is what to do when the client encoding isn't UTF8. None of these is an insurmountable problem, ISTM - we just need to make some decisions. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
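escape_json()'s behavior is visible through any of the JSON-producing paths, e.g. (a sketch using the array_to_json from the patch under discussion):

    SELECT array_to_json(ARRAY[chr(1), E'tab\there', 'é']);
    -- ["\u0001","tab\there","é"]
    -- low control characters become \uXXXX (or short forms like \t),
    -- while high-bit characters pass through in the server encoding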
Re: [HACKERS] JSON for PG 9.2
I am very interested in experimenting with functional indexes into JSON structures. I think this could be very powerful combined with full text search as well as constraints. It would allow for using postgres as an unstructured data store without sacrificing the powerful indexing features, durability, and transactional semantics that comes with using postgres or RDBMSes in general. One use case in particular I have been trying to solve for lately is persisting and synchronizing client-side state (in a mobile application) with a server. It would be nice to have a flat, schemaless table (maybe a table that's like (id, type, owner, data) where data would be a JSON blob). I could do this now without JSON support, but I think indexing inside that JSON blob and having validation database side is valuable as well. And as I mentioned before, i'd rather not throw out the baby with the bathwater by using a different type of database because ACID, replication, and constraints are also very important to this. As is being consistent with the rest of our technology stack. (I'd essentially be using a relational database to persist an object database) I'm also not too concerned about storage consumption with this (even though columnar compression would help a lot in the future) since it's easily partitionable by user ID. For my case the equivalent of postgres's XPath would work. Also having it as a maintained contrib module would be sufficient, although it being part of core as XPath is would be even better. Just my $0.02... even if I'm a bit late to the conversation. Thanks! Mike
Re: [HACKERS] JSON for PG 9.2
I wrote an array_to_json function during GSoC 2010: http://git.postgresql.org/gitweb/?p=json-datatype.git;a=blob;f=json_io.c#l289 It's not exposed as a procedure called array_to_json: it's part of the to_json function, which decides what to do based on the argument type. - Joey -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On 01/12/2012 09:00 AM, Joey Adams wrote: I wrote an array_to_json function during GSoC 2010: http://git.postgresql.org/gitweb/?p=json-datatype.git;a=blob;f=json_io.c#l289 It's not exposed as a procedure called array_to_json: it's part of the to_json function, which decides what to do based on the argument type. Excellent, this is just the point at which I stopped work last night, so with your permission I'll steal this and it will save me a good chunk of time. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
2012/1/12 Andrew Dunstan and...@dunslane.net: On 01/12/2012 09:00 AM, Joey Adams wrote: I wrote an array_to_json function during GSoC 2010: http://git.postgresql.org/gitweb/?p=json-datatype.git;a=blob;f=json_io.c#l289 It's not exposed as a procedure called array_to_json: it's part of the to_json function, which decides what to do based on the argument type. Excellent, this is just the point at which I stopped work last night, so with your permission I'll steal this and it will save me a good chunk of time. this should be little bit more enhanced to support a row arrays - it can be merged with some routines from pst tool http://okbob.blogspot.com/2010/11/new-version-of-pst-collection-is.html Regards Pavel cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On 01/12/2012 10:44 AM, Pavel Stehule wrote: 2012/1/12 Andrew Dunstanand...@dunslane.net: On 01/12/2012 09:00 AM, Joey Adams wrote: I wrote an array_to_json function during GSoC 2010: http://git.postgresql.org/gitweb/?p=json-datatype.git;a=blob;f=json_io.c#l289 It's not exposed as a procedure called array_to_json: it's part of the to_json function, which decides what to do based on the argument type. Excellent, this is just the point at which I stopped work last night, so with your permission I'll steal this and it will save me a good chunk of time. this should be little bit more enhanced to support a row arrays - it can be merged with some routines from pst tool http://okbob.blogspot.com/2010/11/new-version-of-pst-collection-is.html I will be covering composites. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On Thu, Jan 12, 2012 at 9:51 AM, Andrew Dunstan and...@dunslane.net wrote: On 01/12/2012 10:44 AM, Pavel Stehule wrote: this should be little bit more enhanced to support a row arrays - it can be merged with some routines from pst tool http://okbob.blogspot.com/2010/11/new-version-of-pst-collection-is.html I will be covering composites. curious, will your function work on unregistered composites? What would this do? select array_to_json(array[row('a', 1), row('b', 2)]); merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
2012/1/12 Merlin Moncure mmonc...@gmail.com: On Thu, Jan 12, 2012 at 9:51 AM, Andrew Dunstan and...@dunslane.net wrote: On 01/12/2012 10:44 AM, Pavel Stehule wrote: this should be little bit more enhanced to support a row arrays - it can be merged with some routines from pst tool http://okbob.blogspot.com/2010/11/new-version-of-pst-collection-is.html I will be covering composites. curious, will your function work on unregistered composites? What would this do? select array_to_json(array[row('a', 1), row('b', 2)]); it can do it - but it has to use some defaults for names. Pavel merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On 01/12/2012 11:21 AM, Merlin Moncure wrote: On Thu, Jan 12, 2012 at 9:51 AM, Andrew Dunstan and...@dunslane.net wrote: On 01/12/2012 10:44 AM, Pavel Stehule wrote: this should be little bit more enhanced to support a row arrays - it can be merged with some routines from pst tool http://okbob.blogspot.com/2010/11/new-version-of-pst-collection-is.html I will be covering composites. curious, will your function work on unregistered composites? What would this do? select array_to_json(array[row('a', 1), row('b', 2)]); Expected behaviour is something like this: andrew=# select q2json(' select $$a$$ || x as b, y as c, array[row(x.*,array[1,2,3]), row(y.*,array[4,5,6])] as z from generate_series(1,2) x, generate_series(4,5) y'); [{"b":"a1","c":4,"z":[{"f1":1,"f2":[1,2,3]},{"f1":4,"f2":[4,5,6]}]}, {"b":"a1","c":5,"z":[{"f1":1,"f2":[1,2,3]},{"f1":5,"f2":[4,5,6]}]}, {"b":"a2","c":4,"z":[{"f1":2,"f2":[1,2,3]},{"f1":4,"f2":[4,5,6]}]}, {"b":"a2","c":5,"z":[{"f1":2,"f2":[1,2,3]},{"f1":5,"f2":[4,5,6]}]}] cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On 01/11/2012 01:18 AM, Pavel Stehule wrote: I like this patch and this feature. I'm about to read the patch in detail - I certainly like the feature. I see only one issue - there is not functionality that helps generate JSON in pg. What do you think about functions: array_to_json(anyarray), row_to_json(any) and format_json(text, text, ...) Actually, more than these, I (and at least one very interested client) want query_to_json, which would do something like: # select q2json('select $$a$$ || x as b, y as c from generate_series(1,3) x, generate_series(4,5) y'); q2json - [{"b":"a1","c":4},{"b":"a1","c":5},{"b":"a2","c":4},{"b":"a2","c":5},{"b":"a3","c":4},{"b":"a3","c":5}] No doubt several variants are possible such as returning a setof json, one per row, instead of a single json, and allowing query parameters as separate arguments (maybe just using variadic functions), but probably for a first go just something as simple as this would meet the case. Given the short time span available before patches must be in, I am prepared to work on this ASAP. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
2012/1/11 Andrew Dunstan and...@dunslane.net: On 01/11/2012 01:18 AM, Pavel Stehule wrote: I like this patch and this feature. I'm about to read the patch in detail - I certainly like the feature. I see only one issue - there is not functionality that helps generate JSON in pg. What do you think about functions: array_to_json(anyarray), row_to_json(any) and format_json(text, text, ...) Actually, more than these, I (and at least one very interested client) want query_to_json, which would do something like: # select q2json('select $$a$$ || x as b, y as c from generate_series(1,3) x, generate_series(4,5) y'); q2json - [{"b":"a1","c":4},{"b":"a1","c":5},{"b":"a2","c":4},{"b":"a2","c":5},{"b":"a3","c":4},{"b":"a3","c":5}] we have a query_to_xml - so there should be a similar query_to_json. But this is not enough for usage from SP. What about two rich functions * query_to_json - by your proposal * array_to_json - with possibility to serialize array of records This can be a basic set Regards Pavel No doubt several variants are possible such as returning a setof json, one per row, instead of a single json, and allowing query parameters as separate arguments (maybe just using variadic functions), but probably for a first go just something as simple as this would meet the case. Given the short time span available before patches must be in, I am prepared to work on this ASAP. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On Wed, Jan 11, 2012 at 1:18 AM, Pavel Stehule pavel.steh...@gmail.com wrote: 2012/1/11 Robert Haas robertmh...@gmail.com: On Tue, Dec 20, 2011 at 9:06 PM, David E. Wheeler da...@kineticode.com wrote: On Dec 20, 2011, at 10:39 AM, Claes Jakobsson wrote: Are people explicitly asking for a) *JSON* datatype or b) a type that lets you store arbitrary complex semi-untyped data structures? Yes. if b) then this might get a lot more interesting JSON is the most popular/likely way to represent that, I think. On that note, here's an updated version of the patch I posted upthread, with some regression tests and minimal documentation. I like this patch and this feature. I see only one issue - there is not functionality that helps generate JSON in pg. What do you think about functions: array_to_json(anyarray), row_to_json(any) and format_json(text, text, ...) I think we might want all of that stuff, but I doubt there is time to do it for 9.2. Actually, I think the next logical step would be to define equality (is there an official definition of that for JSON?) and build a btree opclass. I believe the code I've already written could be extended to construct an abstract syntax tree for those operations that need it. But we need to make some decisions first. A btree opclass requires a total ordering, so we have to arbitrarily define whether 1 < true, 1 < [1], 1 < "1", etc. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
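In other words, a btree opclass needs a support function imposing a total order on every pair of JSON values, along these lines (hypothetical; nothing like this exists in the patch):

    -- hypothetical comparison support function an opclass would require:
    --   json_cmp(json, json) returns integer   -- -1 / 0 / +1, total order
    -- which forces arbitrary cross-type decisions, e.g.:
    SELECT '1'::json < 'true'::json;   -- number vs boolean: which sorts first?
    SELECT '1'::json < '[1]'::json;    -- scalar vs array?
    -- today both fail, since the patch defines no comparison operators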
Re: [HACKERS] JSON for PG 9.2
2012/1/11 Robert Haas robertmh...@gmail.com: On Wed, Jan 11, 2012 at 1:18 AM, Pavel Stehule pavel.steh...@gmail.com wrote: 2012/1/11 Robert Haas robertmh...@gmail.com: On Tue, Dec 20, 2011 at 9:06 PM, David E. Wheeler da...@kineticode.com wrote: On Dec 20, 2011, at 10:39 AM, Claes Jakobsson wrote: Are people explicitly asking for a) *JSON* datatype or b) a type that lets you store arbitrary complex semi-untyped data structures? Yes. if b) then this might get a lot more interesting JSON is the most popular/likely way to represent that, I think. On that note, here's an updated version of the patch I posted upthread, with some regression tests and minimal documentation. I like this patch and this feature. I see only one issue - there is not functionality that helps generate JSON in pg. What do you think about functions: array_to_json(anyarray), row_to_json(any) and format_json(text, text, ...) I think we might want all of that stuff, but I doubt there is time to do it for 9.2. Actually, I think the next logical step would be to define equality (is there an official definition of that for JSON?) and build a btree opclass. I believe the code I've already written could be extended to construct an abstract syntax tree for those operations that need it. But we need to make some decisions first. A btree opclass requires a total ordering, so we have to arbitrarily define whether 1 < true, 1 < [1], 1 < "1", etc. I don't understand why we have to do it? We don't support similar functionality for XML, so why for JSON? Pavel -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On Wed, Jan 11, 2012 at 8:38 AM, Pavel Stehule pavel.steh...@gmail.com wrote: I don't understand why we have to do it? We don't support similar functionality for XML, so why for JSON? Hrm. Well, that's an interesting point. Maybe we don't. I assumed that people would eventually want to optimize queries of the form SELECT whatever FROM tab WHERE jsoncol = 'constant'. If that's a sufficiently marginal use case that we don't care, then fine. One difference between JSON and XML is that XML really has no well-defined comparison semantics. For example, consider: <foo><bar>1.0</bar></foo> <foo><bar>1.0</bar> </foo> If the XML is being used as a transport mechanism, then the extra space is semantically insignificant, but if this is markup, then it might matter a lot. Also, consider: <foo><bar>1.00</bar></foo> That one might be equal if we think 1.0 is intended to be a number, but if it's intended as a string then it's not. We could perhaps do comparisons in XML relative to some DTD or schema if those provide details about what the values mean, but in a vacuum it's not well-defined. On the other hand, in JSON, it's pretty clear that [ 1, 2, 3 ] is the same value as [1,2,3] but "1,2,3" is different from "1, 2, 3". There are some borderline cases that might need some sweat, like whether 1 = 1.0 = 1.00 = 1e0, but in general the level of ambiguity seems to me to be much less, making it more feasible here than it would be for XML. That having been said, uncertainties about whether we want this at all (and if so in what form) are exactly why I didn't include this kind of stuff in the patch to begin with, and I think that if we get this much committed for 9.2 we'll be doing pretty well. If we can agree on and do more, great; if not, we'll at least have this much, which IMHO would be an improvement over what we have now. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
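The JSON half of that comparison is easy to check against the patch's validate-and-store behavior (sketch):

    SELECT '[1, 2, 3]'::json, '[1,2,3]'::json;
    -- both valid, both stored verbatim today; a future equality operator
    -- would have to call them equal, while still distinguishing
    -- '"1,2,3"'::json from '"1, 2, 3"'::json, since whitespace inside a
    -- string is part of the value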
Re: [HACKERS] JSON for PG 9.2
2012/1/11 Robert Haas robertmh...@gmail.com: On Wed, Jan 11, 2012 at 8:38 AM, Pavel Stehule pavel.steh...@gmail.com wrote: I don't understand why we have to do it? We don't support similar functionality for XML, so why for JSON? Hrm. Well, that's an interesting point. Maybe we don't. I assumed that people would eventually want to optimize queries of the form SELECT whatever FROM tab WHERE jsoncol = 'constant'. If that's a sufficiently marginal use case that we don't care, then fine. One difference between JSON and XML is that XML really has no well-defined comparison semantics. For example, consider: <foo><bar>1.0</bar></foo> <foo><bar>1.0</bar> </foo> If the XML is being used as a transport mechanism, then the extra space is semantically insignificant, but if this is markup, then it might matter a lot. Also, consider: <foo><bar>1.00</bar></foo> That one might be equal if we think 1.0 is intended to be a number, but if it's intended as a string then it's not. We could perhaps do comparisons in XML relative to some DTD or schema if those provide details about what the values mean, but in a vacuum it's not well-defined. On the other hand, in JSON, it's pretty clear that [ 1, 2, 3 ] is the same value as [1,2,3] but "1,2,3" is different from "1, 2, 3". There are some borderline cases that might need some sweat, like whether 1 = 1.0 = 1.00 = 1e0, but in general the level of ambiguity seems to me to be much less, making it more feasible here than it would be for XML. That having been said, uncertainties about whether we want this at all (and if so in what form) are exactly why I didn't include this kind of stuff in the patch to begin with, and I think that if we get this much committed for 9.2 we'll be doing pretty well. If we can agree on and do more, great; if not, we'll at least have this much, which IMHO would be an improvement over what we have now. I understand it now. My opinion is so some operators and index search can be in 9.2 - so use a JSON just as communication format now. * we need to build JSON * we need to check if some is valid JSON * we need to store JSON other steps should be (9.2) * basic operators eq, neeq * some tool like XQuery - simple query on JSON document available from SQL that can be used for functional indexes. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On Wed, Jan 11, 2012 at 9:41 AM, Pavel Stehule pavel.steh...@gmail.com wrote: I understand it now. My opinion is so some operators and index search can be in 9.2 - so use a JSON just as communication format now. * we need to build JSON * we need to check if some is valid JSON * we need to store JSON other steps should be (9.2) * basic operators eq, neeq * some tool like XQuery - simple query on JSON document available from SQL that can be used for functional indexes. That would be nice, but let's not let the perfect be the enemy of the good. We don't have a lot of time here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
2012/1/11 Robert Haas robertmh...@gmail.com: On Wed, Jan 11, 2012 at 9:41 AM, Pavel Stehule pavel.steh...@gmail.com wrote: I understand it now. My opinion is so some operators and index search can be in 9.2 - so use a JSON just as communication format now. * we need to build JSON * we need to check if some is valid JSON * we need to store JSON other steps should be (9.2) * basic operators eq, neeq * some tool like XQuery - simple query on JSON document available from SQL that can be used for functional indexes. That would be nice, but let's not let the perfect be the enemy of the good. We don't have a lot of time here. sorry - replace 9.2 by 9.3 - I am sorry. I am able to write the array_to_json fce and Andrew can write query_to_json Pavel -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
I am able to write array_to_json fce and Andrew can write query_to_json +1 Thanks guys... We are using a lot of JSON as communication protocol... having core support for JSON, And those functions, will be a real life saver... Many thanks, Misa Sent from my Windows Phone From: Pavel Stehule Sent: 11/01/2012 16:22 To: Robert Haas Cc: David E. Wheeler; Claes Jakobsson; Dimitri Fontaine; Merlin Moncure; Andrew Dunstan; Magnus Hagander; Jan Urbański; Simon Riggs; Joey Adams; Bruce Momjian; PostgreSQL-development Hackers; Jan Wieck Subject: Re: [HACKERS] JSON for PG 9.2 2012/1/11 Robert Haas robertmh...@gmail.com: On Wed, Jan 11, 2012 at 9:41 AM, Pavel Stehule pavel.steh...@gmail.com wrote: I understand it now. My opinion is so some operators and index search can be in 9.2 - so use a JSON just as communication format now. * we need to build JSON * we need to check if some is valid JSON * we need to store JSON other steps should be (9.2) * basic operators eq, neeq * some tool like XQuery - simple query on JSON document available from SQL that can be used for functional indexes. That would be nice, but let's not let the perfect be the enemy of the good. We don't have a lot of time here. sorry - replace 9.2 by 9.3 - I am sorry I am able to write array_to_json fce and Andrew can write query_to_json Pavel -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On 01/11/2012 10:21 AM, Pavel Stehule wrote: 2012/1/11 Robert Haas robertmh...@gmail.com: On Wed, Jan 11, 2012 at 9:41 AM, Pavel Stehule pavel.steh...@gmail.com wrote: I understand it now. My opinion is so some operators and index search can be in 9.2 - so use a JSON just as communication format now. * we need to build JSON * we need to check if some is valid JSON * we need to store JSON other steps should be (9.2) * basic operators eq, neeq * some tool like XQuery - simple query on JSON document available from SQL that can be used for functional indexes. That would be nice, but let's not let the perfect be the enemy of the good. We don't have a lot of time here. sorry - replace 9.2 by 9.3 - I am sorry I am able to write array_to_json fce and Andrew can write query_to_json For those who want to play along, see https://bitbucket.org/adunstan/pgdevel which has Robert's patch and my additions to it. I'm actually half way through writing an array_to_json function, since it is necessary anyway for query_to_json. I hope to have a fairly complete working function in about 24 hours. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On Tue, Dec 20, 2011 at 9:06 PM, David E. Wheeler da...@kineticode.com wrote:
> On Dec 20, 2011, at 10:39 AM, Claes Jakobsson wrote:
>> Are people explicitly asking for a) a *JSON* datatype, or b) a type that lets you store arbitrarily complex, semi-untyped data structures?
>
> Yes.
>
>> if b) then this might get a lot more interesting
>
> JSON is the most popular/likely way to represent that, I think.

On that note, here's an updated version of the patch I posted upthread, with some regression tests and minimal documentation.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[Attachment: json-rmh-v2.patch (binary data)]
Re: [HACKERS] JSON for PG 9.2
2012/1/11 Robert Haas robertmh...@gmail.com:
> [...]
> On that note, here's an updated version of the patch I posted upthread, with some regression tests and minimal documentation.

I like this patch and this feature. I see only one issue: there is no functionality that helps generate JSON in PG. What do you think about functions array_to_json(anyarray), row_to_json(any), and format_json(text, text, ...)?

Regards

Pavel
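For illustration, this is roughly how the first two proposed functions would be used; the outputs shown assume the semantics Pavel describes (format_json never went beyond a proposal here):

    SELECT array_to_json(ARRAY[1, 2, 3]);
    -- [1,2,3]

    SELECT row_to_json(q) FROM (SELECT 1 AS id, 'red'::text AS color) q;
    -- {"id":1,"color":"red"}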
Re: [HACKERS] JSON for PG 9.2
Let me mention another lightweight data-interchange format. At http://www.janestreet.com we have developed a small C module to deal with S-expressions (sexps) as a way to store arbitrary data. As we write most of our code in OCaml, sexps are a natural way for us to store data (http://hg.ocaml.info/release/sexplib/ provides automatic ways to convert any OCaml value into a sexp).

The extension is still pretty new, but we use it successfully on a daily basis. After we have upgraded to 9.x we will package it as an extension and release it open source.

API-wise, the module at the moment offers the following:

sexp_validate(text) returns boolean
  Validate that the passed-in text is a valid s-expression.

create domain sexp as text check (sexp_validate(value));

BTW: it is a PITA that arrays of domains are not valid types.

And several functions to take apart or modify sexps using a path into the sexp (similar to what xpath does for xml), such as:

sexp_get(sexp, text) returns sexp
  Get the sub-sexp of sexp identified by the path. Returns NULL if path is not a valid path in sexp.
  Example: path = '.a space.b.[1].x'
    ((ignore this) (a space ((b (0 ((also ignored) (x The Value)) )) )))  ->  The Value

sexp_get_atom(sexp, text) returns text
  Get the sub-atom of sexp identified by the path. Returns NULL if path is not a valid path in sexp or does not identify an atom.
  Example: path = '.a space.b.[1].x'
    ((ignore this) (a space ((b (0 ((also ignored) (x The Value)) )) )))  ->  The Value

Cheers,
Bene

On 20/12/11 19:39, Claes Jakobsson wrote:
> Are people explicitly asking for a) a *JSON* datatype, or b) a type that lets you store arbitrarily complex, semi-untyped data structures? if b) then this might get a lot more interesting [...]
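A usage sketch of the API described above. The function names and path syntax are taken from the post, not from anything shipped with PostgreSQL, so treat this as illustrative only:

    -- Constrain a column to well-formed s-expressions via the domain:
    CREATE DOMAIN sexp AS text CHECK (sexp_validate(VALUE));
    CREATE TABLE configs (body sexp);

    INSERT INTO configs VALUES ('((ignore this) (a space ((b (0 ((also ignored) (x The Value)) )) )))');

    -- Walk into the structure with a path, xpath-style:
    SELECT sexp_get(body, '.a space.b.[1].x') FROM configs;       -- sub-sexp
    SELECT sexp_get_atom(body, '.a space.b.[1].x') FROM configs;  -- its text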
Re: [HACKERS] JSON for PG 9.2
David E. Wheeler da...@justatheory.com writes:
> ... holds: it is not a programming language, and one does not need a PL to have a JSON data type.

Exactly. That does not contradict the fact that if you have pl/ecmascript you already have JSON. And we might as well have had the ecmascript PL for some time now; we just need to check on that.

Regards,

--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] JSON for PG 9.2
On Tuesday, December 20, 2011 07:23:43 PM Dimitri Fontaine wrote:
> David E. Wheeler da...@justatheory.com writes:
>> ... holds: it is not a programming language, and one does not need a PL to have a JSON data type.
>
> Exactly. That does not contradict the fact that if you have pl/ecmascript you already have JSON. And we might as well have had the ecmascript PL for some time now; we just need to check on that.

Not really. You need to be able to evaluate JSON without it possibly executing code. Many JS implementations are likely to have such a feature, though.

Andres
Re: [HACKERS] JSON for PG 9.2
On Dec 20, 2011, at 12:39 AM, David E. Wheeler wrote:
> On Dec 19, 2011, at 2:49 AM, Dimitri Fontaine wrote:
>> My understanding is that JSON is a subset of ECMAscript
>
> Well, no, JSON is formally “a lightweight data-interchange format.” It’s derived from JavaScript syntax, but it is not a programming language, so I wouldn’t say it was accurate to describe it as a subset of JS or ECMAScript. http://json.org/

Are people explicitly asking for a) a *JSON* datatype, or b) a type that lets you store arbitrarily complex, semi-untyped data structures?

If b), then this might get a lot more interesting.

Cheers,
Claes
Re: [HACKERS] JSON for PG 9.2
On Mon, Dec 19, 2011 at 5:49 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
> My understanding is that JSON is a subset of ECMAscript, so if you get the latter you already have the former. Now, someone would have to check whether plscheme still builds with guile-2.0, and given that, how exactly you get pl/ecmascript (or pl/js) out of that.

I don't think so. I checked it out (still on pgfoundry, still on CVS, and the code hasn't been touched since 2008), and ran into some issues.

- It looks for libguile.h:

    #include <libguile.h>

  which, on 2.0, has shifted from /usr/include/libguile.h (1.8) to /usr/include/guile/2.0/libguile.h.

- It's not doing enough indirection internally; there is a guile-config that is analogous to pg_config:

    postgres@cbbrowne [03:48:43] [~/PostgreSQL/plscheme] % guile-config compile
    -pthread -I/usr/include/guile/2.0
    postgres@cbbrowne [03:48:45] [~/PostgreSQL/plscheme] % guile-config link
    -lguile-2.0 -lgc

- It looks like there's something PG-related as a next issue:

    % ./install.sh
    pg_config : /var/lib/postgresql/dbs/postgresql-HEAD/bin/pg_config
    module-dir: /var/lib/postgresql/dbs/postgresql-HEAD/lib
    max-cache-size: 64
    dbname: postgres
    safe-r5rs : NO
    dbacreate : NO
    PSQL : /var/lib/postgresql/dbs/postgresql-HEAD/bin/psql postgres
    CPPFLAGS : -g -Wall -fpic -c -I/var/lib/postgresql/dbs/postgresql-HEAD/include/server -I/usr/include/guile/2.0
    LDFLAGS : -shared -lguile
    Compiling... failed!
    plscheme.c: In function '_PG_init':
    plscheme.c:647:2: warning: implicit declaration of function 'DefineCustomStringVariable' [-Wimplicit-function-declaration]
    plscheme.c:650:30: error: 'PGC_BACKEND' undeclared (first use in this function)
    plscheme.c:650:30: note: each undeclared identifier is reported only once for each function it appears in
    plscheme.c:652:2: warning: implicit declaration of function 'DefineCustomIntVariable' [-Wimplicit-function-declaration]
    plscheme.c: In function 'plscheme_func_handler':
    plscheme.c:742:2: warning: implicit declaration of function 'GetTopTransactionId' [-Wimplicit-function-declaration]
    plscheme.c: In function 'parse_trig_args':
    plscheme.c:1623:44: error: dereferencing pointer to incomplete type
    plscheme.c:1628:38: error: dereferencing pointer to incomplete type
    ... (See error.log for details.)

I'm not sure to what degree this is bitrot relating to a) Postgres changes or b) Guile changes, but there's doubtless a bit of both. I'd think it interesting to get this back to working order, whether it's useful for JavaScript or not.

--
When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
Re: [HACKERS] JSON for PG 9.2
On Dec 19, 2011, at 9:00 PM, Robert Haas wrote:
>> +1, though I think the core type will at least need some basic operators and indexing support.
>
> And I'm willing to do that, but I thought it best to submit a bare-bones patch first, in the hopes of minimizing the number of objectionable things therein. For example, if you want to be able to index a JSON column, you have to decide on some collation order that is consistent with JSON's notion of equality, and it's not obvious what is most logical. Heck, equality itself isn't 100% obvious. If there's adequate support for including JSON in core, and nobody objects to my implementation, then I'll throw some ideas for those things up against the wall and see what sticks.

+1. Sounds good to me.

David
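The equality problem Robert describes is easy to demonstrate. Assuming the type stores the input text verbatim (as the json type that ultimately shipped in 9.2 does), the only obvious comparison is textual, and that disagrees with JSON semantics; a sketch:

    -- The same JSON object, but unequal as text because key order differs:
    SELECT '{"a": 1, "b": 2}'::json::text = '{"b": 2, "a": 1}'::json::text;  -- false

    -- Whitespace is significant to a textual comparison too:
    SELECT '[1,2,3]'::json::text = '[1, 2, 3]'::json::text;  -- false

Any btree opclass built on that textual ordering would treat semantically equal documents as distinct keys, which is exactly why the collation question needs deciding before indexing support.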
Re: [HACKERS] JSON for PG 9.2
On Dec 20, 2011, at 2:13 AM, Magnus Hagander wrote:
> Yes, that's why I said "few", not "none". Though in my experience, most companies are a lot more restrictive about addons to their database than addons to their development environments.

Yeah, we’re getting off-topic here, so I’ll just say something we can agree on: we’ll see. I do still want to see some processes for getting PGXN distributions into RPM/.deb/StackBuilder, though.

Best,

David
Re: [HACKERS] JSON for PG 9.2
On Dec 20, 2011, at 10:39 AM, Claes Jakobsson wrote:
> Are people explicitly asking for a) a *JSON* datatype, or b) a type that lets you store arbitrarily complex, semi-untyped data structures?

Yes.

> if b) then this might get a lot more interesting

JSON is the most popular/likely way to represent that, I think.

David
Re: [HACKERS] JSON for PG 9.2
On Tue, Dec 20, 2011 at 00:26, David E. Wheeler da...@kineticode.com wrote:
> On Dec 18, 2011, at 4:41 AM, Magnus Hagander wrote:
>> We can hopefully get around this for the extensions in contrib [...]
>
> I support having a JSON type in core, but question the assertions here. *Some* organizations won’t use PGXN, usually because they require things through a different ecosystem (RPMs, .debs, StackBuilder, etc.). But many others will. There are a *lot* of companies out there that use CPAN, easy_install, and Gem. The same sorts of places will use PGXN.

Yes, that's why I said "few", not "none". Though in my experience, most companies are a lot more restrictive about addons to their database than addons to their development environments.

And note that it's not PGXN that's the problem I'm pointing at; neither is it CPAN or easy_install or gem. The problem is the vulnerability of the addon, and the maintenance. Meaning, if it has a single maintainer, that's a whole different thing from being maintained by the PGDG.

> Oh, and at PGX, we’ll happily provide support for random modules, so long as you pay for our time. We’re not picky (and happy to send improvements back upstream), though we might recommend you switch to something better. But such evaluations are based on quality, not simply on what ecosystem it came from.

I think we're talking about different things here. While we can certainly provide support on specific modules after that is entered into the agreement with the customer, we won't support a customer who just calls up and says "hey, I'm using module xyz which you've never heard of, and it crashes my database, please come fix it now". Are you saying you do that - providing SLAs, 24/7 and similar things, on modules you didn't even know the customer was using?

And FWIW, I'm talking about the quality, and not the ecosystem, as well. I'm just saying it takes a lot more work to verify the quality and maintenance of an external module; if it's part of PostgreSQL, you have *already* got a quality stamp and a maintenance promise from that.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] JSON for PG 9.2
On Tue, Dec 20, 2011 at 06:00, Robert Haas robertmh...@gmail.com wrote:
> On Mon, Dec 19, 2011 at 6:26 PM, David E. Wheeler da...@kineticode.com wrote:
>> +1, though I think the core type will at least need some basic operators and indexing support.
>
> And I'm willing to do that, but I thought it best to submit a bare-bones patch first, in the hopes of minimizing the number of objectionable things therein. [...] If there's adequate support for including JSON in core, and nobody objects to my implementation, then I'll throw some ideas for those things up against the wall and see what sticks.

+1 for getting the basics in first, and then adding more to it later. There's still a fair amount of time to do that for 9.2, but not if we get stuck bikeshedding again...

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] JSON for PG 9.2
Merlin Moncure mmonc...@gmail.com writes:
> Getting back on point, I'm curious about your statement: "without writing a single line of C". I took a look at the pl/scheme docs and was pretty impressed. What exactly would be involved in getting a guile-based ECMAscript working over the pl/scheme implementation? How would that interact with the stated topic, JSON support? Do you even need a json type if you have strong library-based parsing and composition features?

My understanding is that JSON is a subset of ECMAscript, so if you get the latter you already have the former. Now, someone would have to check whether plscheme still builds with guile-2.0, and given that, how exactly you get pl/ecmascript (or pl/js) out of that.

I won't be in a position to spend time on that this year…

Regards,

--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] JSON for PG 9.2
On Dec 18, 2011, at 4:41 AM, Magnus Hagander wrote:
> We can hopefully get around this for the extensions in contrib (and reasonably well have already), but few large companies are going to be happy to go to pgxn and download an extension that has a single maintainer (not the team, and in most cases not even a team), usually no defined lifecycle, no support, etc. (I'm pretty sure you won't get support included for random pgxn modules when you buy a contract from EDB, or CMD, or us, or PGX, or anybody really, whereas if the datatype is in core, you *will* get this.)

I support having a JSON type in core, but question the assertions here. *Some* organizations won’t use PGXN, usually because they require things through a different ecosystem (RPMs, .debs, StackBuilder, etc.). But many others will. There are a *lot* of companies out there that use CPAN, easy_install, and Gem. The same sorts of places will use PGXN.

Oh, and at PGX, we’ll happily provide support for random modules, so long as you pay for our time. We’re not picky (and happy to send improvements back upstream), though we might recommend you switch to something better. But such evaluations are based on quality, not simply on what ecosystem it came from.

> If we can find a way to have a stable part in core and then have addons that can provide these tons of interesting features (which I agree there are) until such time that they can be considered stable enough for core, I think that's the best compromise.

+1, though I think the core type will at least need some basic operators and indexing support.

Best,

David
Re: [HACKERS] JSON for PG 9.2
On Dec 19, 2011, at 3:39 PM, David E. Wheeler wrote:
> Well, no, JSON is formally “a lightweight data-interchange format.” It’s derived from JavaScript syntax, but it is not a programming language, so I wouldn’t say it was accurate to describe it as a subset of JS or ECMAScript.

Bah, it says “It is based on a subset of the JavaScript Programming Language, Standard ECMA-262 3rd Edition - December 1999.” But my point still holds: it is not a programming language, and one does not need a PL to have a JSON data type.

Best,

David
Re: [HACKERS] JSON for PG 9.2
On Dec 19, 2011, at 2:49 AM, Dimitri Fontaine wrote:
> My understanding is that JSON is a subset of ECMAscript

Well, no, JSON is formally “a lightweight data-interchange format.” It’s derived from JavaScript syntax, but it is not a programming language, so I wouldn’t say it was accurate to describe it as a subset of JS or ECMAScript. http://json.org/

IOW, one does not need a new PL to get this type.

Best,

David
Re: [HACKERS] JSON for PG 9.2
On Mon, Dec 19, 2011 at 6:26 PM, David E. Wheeler da...@kineticode.com wrote:
> +1, though I think the core type will at least need some basic operators and indexing support.

And I'm willing to do that, but I thought it best to submit a bare-bones patch first, in the hopes of minimizing the number of objectionable things therein. For example, if you want to be able to index a JSON column, you have to decide on some collation order that is consistent with JSON's notion of equality, and it's not obvious what is most logical. Heck, equality itself isn't 100% obvious. If there's adequate support for including JSON in core, and nobody objects to my implementation, then I'll throw some ideas for those things up against the wall and see what sticks.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] JSON for PG 9.2
On 18/12/11 04:21, Robert Haas wrote:
> On Sat, Dec 17, 2011 at 7:50 PM, David E. Wheeler da...@kineticode.com wrote:
>> Love having the start here. I forwarded this message to Claes Jakobsson, creator of the jansson-using pg-json extension. He’s a bit less supportive. He gave me permission to quote him here:
>>
>>   Frankly I see the inclusion of a JSON datatype in core as unnecessary. Stuff should be moved out of core rather than in, as we do in Perl. Also, does this patch mean that the 'json' type is forever claimed and can't be replaced by extensions? There's little reason to reimplement JSON parsing, comparison and other routines when there's a multitude of already good libraries.
>
> That's fair enough, but we've had *many* requests for this functionality in core; I don't see what we lose by having at least some basic functionality built in.

I think having a JSON data type in core would drastically limit the exposure third-party JSON extensions would get, and that's bad. There are tons of interesting features a JSON type could have, and tying its development to a one-year release cycle might be a disservice both to people who are willing to provide these features earlier and to users faced with a choice between a fast-moving third-party addon and a blessed core type; it would cause overall confusion.

How about we try the tsearch way and let JSON extensions live outside core for some time, and perhaps if one emerges dominant and would benefit from inclusion, then consider it?

If we keep treating extensions as second-class citizens, they'll never get the mindshare and importance we seem to want for them (or otherwise why go through all the trouble to provide an infrastructure for them).

Cheers,
Jan
Re: [HACKERS] JSON for PG 9.2
2011/12/18 Jan Urbański wulc...@wulczer.org:
> How about we try the tsearch way and let JSON extensions live outside core for some time, and perhaps if one emerges dominant and would benefit from inclusion, then consider it? [...]

It should be contrib modules.

Pavel
Re: [HACKERS] JSON for PG 9.2
Robert Haas robertmh...@gmail.com writes:
> On Sat, Dec 17, 2011 at 5:02 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
>> I'd like to add some confusion on the implementation choice, because it looks damn too easy now… Guile 2.0 offers an implementation of the ECMAscript language and plscheme already exists as a PostgreSQL PL extension for integrating with Guile.
>
> It seems like the licensing there could potentially be problematic. It's GPL with a linking exception. Not sure we want to go there.

It's LGPL, so it's compatible (only the readline part is subject to the GPL; we're familiar enough with that, though).

http://www.gnu.org/software/guile/docs/docs-2.0/guile-ref/Guile-License.html

  The Guile library (libguile) and supporting files are published under the terms of the GNU Lesser General Public License version 3 or later. See the files COPYING.LESSER and COPYING.

  C code linking to the Guile library is subject to terms of that library. Basically such code may be published on any terms, provided users can re-link against a new or modified version of Guile.

  Scheme level code written to be run by Guile (but not derived from Guile itself) is not restricted in any way, and may be published on any terms. We encourage authors to publish on Free terms.

Regards,

--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] JSON for PG 9.2
On Sun, Dec 18, 2011 at 10:49, Jan Urbański wulc...@wulczer.org wrote:
> On 18/12/11 04:21, Robert Haas wrote:
>> That's fair enough, but we've had *many* requests for this functionality in core; I don't see what we lose by having at least some basic functionality built in.
>
> I think having a JSON data type in core would drastically limit the exposure third-party JSON extensions would get, and that's bad.

The same way that having replication in core is bad for the rest of the replication engines? While it has certainly decreased the usage of, for example, Slony, I don't think anybody can say it's a bad thing that we have this in core... And of course, *not* having it in core, we didn't have people claiming for many years that postgres has no replication or anything like that...

The fact is that a *lot* of our users, particularly in large companies, will never install an extension that's not part of core. Just look at other discussions about it even being a problem with it being in *contrib*, which is still maintained and distributed by the same developers.

We can hopefully get around this for the extensions in contrib (and reasonably well have already), but few large companies are going to be happy to go to pgxn and download an extension that has a single maintainer (not the team, and in most cases not even a team), usually no defined lifecycle, no support, etc. (I'm pretty sure you won't get support included for random pgxn modules when you buy a contract from EDB, or CMD, or us, or PGX, or anybody really, whereas if the datatype is in core, you *will* get this.)

So I'm not sure it would really lessen the exposure much at all; those that are willing to install such extensions already are surely capable of finding it themselves (using pgxn for example, or even google).

> There are tons of interesting features a JSON type could have, and tying its development to a one-year release cycle might be a disservice both to people who are willing to provide these features earlier and to users faced with a choice between a fast-moving third-party addon and a blessed core type; it would cause overall confusion.

And the other option would be to *only* have a fast-moving third-party addon, which simply disqualifies it completely in many environments. Keeping it as a third-party addon is better for the developer. Keeping it in core is better for the user (if the user is a large company, not a hacker).

If we can find a way to have a stable part in core and then have addons that can provide these tons of interesting features (which I agree there are) until such time that they can be considered stable enough for core, I think that's the best compromise.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] JSON for PG 9.2
On Sat, Dec 17, 2011 at 4:02 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
> ... to add some confusion on the implementation choice, because it looks damn too easy now… Guile 2.0 offers an implementation of the ECMAscript language and plscheme already exists as a PostgreSQL PL extension for integrating with Guile.

TBH, I think that's PFC (pretty cool).

On Sun, Dec 18, 2011 at 6:41 AM, Magnus Hagander mag...@hagander.net wrote:
> We can hopefully get around this for the extensions in contrib (and reasonably well have already), but few large companies are going to be happy to go to pgxn and download an extension that has a single maintainer (not the team, and in most cases not even a team), usually no defined lifecycle, no support, etc. (I'm pretty sure you won't get support included for random pgxn modules when you buy a contract from EDB, or CMD, or us, or PGX, or anybody really, whereas if the datatype is in core, you *will* get this.)

100% agree on all points. With the new extension system, contrib modules that are packaged with the core system can be considered to be "in core" because they are:

*) documented in the standard docs
*) supported and bugfixed with PostgreSQL releases
*) ready to be used without compiler support, or even shell access, through most binary distributions

One small note about the json type being an extension: this probably means the json type oid won't be fixed. Not a huge deal, but it could affect some corner cases with binary-format consumers.

merlin
Re: [HACKERS] JSON for PG 9.2
On 12/18/2011 12:17 PM, Merlin Moncure wrote:
> One small note about the json type being an extension: this probably means the json type oid won't be fixed. Not a huge deal, but it could affect some corner cases with binary-format consumers.

Why would that matter more for JSON than for any other non-core type?

cheers

andrew
Re: [HACKERS] JSON for PG 9.2
On Sun, Dec 18, 2011 at 11:21 AM, Andrew Dunstan and...@dunslane.net wrote:
> Why would that matter more for JSON than for any other non-core type?

Well, it's a minor headache for all the oid-isn't-in-pg_type.h types, and only then for high-traffic types (which presumably json will be).

A while back we coded up a reworked dblink that was variadic and could optionally transfer data between databases with the binary wire format. Any container of a user-defined (by oid) type had to be sent strictly as text, which is a big performance hit for certain types. Recent PostgreSQL has an undocumented facility to force type oids to a particular value, but the type definition being inside the CREATE EXTENSION script makes this problematic.

This is a pretty far-out objection though, and I could certainly work around the problem if necessary, but there is some dependency on pg_type.h in the wild.

merlin
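The usual workaround for a non-fixed type OID is to resolve it at run time instead of hard-coding a pg_type.h constant; a binary-format consumer can do the lookup once per connection. A sketch (not dblink's actual code):

    -- Resolve the json type's OID on the fly:
    SELECT 'json'::regtype::oid;

    -- Equivalent explicit catalog query:
    SELECT oid FROM pg_type WHERE typname = 'json';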
Re: [HACKERS] JSON for PG 9.2
Merlin Moncure mmonc...@gmail.com writes:
>> Why would that matter more for JSON than for any other non-core type?
>
> Well, it's a minor headache for all the oid-isn't-in-pg_type.h types, and only then for high-traffic types (which presumably json will be).

Extensions are going to be more and more used and “pervasive” in the coming years, and binary wire transfers are a good goal. What about creating something like an IANA for PostgreSQL types? New type authors would register their OID and, as a benefit, would get listed on some public reference sheet, and we could add some mechanism so that default CREATE TYPE calls will not use reserved OID numbers.

Then it would be all cooperative only; not a security thing, just a way to ease binary and extension co-existence.

Regards,

--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] JSON for PG 9.2
On Sun, Dec 18, 2011 at 12:26 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
> What about creating something like an IANA for PostgreSQL types? New type authors would register their OID and, as a benefit, would get listed on some public reference sheet, and we could add some mechanism so that default CREATE TYPE calls will not use reserved OID numbers. [...]

I think that's a fabulous idea, although we're drifting off the stated topic here.

Getting back on point, I'm curious about your statement: "without writing a single line of C". I took a look at the pl/scheme docs and was pretty impressed. What exactly would be involved in getting a guile-based ECMAscript working over the pl/scheme implementation? How would that interact with the stated topic, JSON support? Do you even need a json type if you have strong library-based parsing and composition features?

merlin
Re: [HACKERS] JSON for PG 9.2
On Sat, Dec 17, 2011 at 2:26 AM, Robert Haas robertmh...@gmail.com wrote:
> In the spirit of Simon's suggestion that we JFDI, I cooked up a patch today that JFDI. See attached.

Which looks very good. Comments:

* The comment for IDENTIFICATION of json.c says contrib/json/json.c.
* json.c contains a duplicate of a line from the header file: extern Datum json_in(PG_FUNCTION_ARGS);

And additionally, a quote from our fine manual:

  Caution: Some XML-related functions may not work at all on non-ASCII data when the server encoding is not UTF-8. This is known to be an issue for xpath() in particular.

so I think this approach works for JSON too.

Adding tests and docs is a must; nothing else is, right now. Once we have this, others can add the bells and whistles, possibly in 9.2.

--
Simon Riggs
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] JSON for PG 9.2
Hi,

Peter Eisentraut pete...@gmx.net writes:
> The way forward here is to maintain this as an extension, provide debs and rpms, and show that that is maintainable.

I can see numerous advantages in maintaining a PL outside the core, especially if you are still starting up and want to iterate quickly.

I'd like to add some confusion on the implementation choice, because it looks damn too easy now… Guile 2.0 offers an implementation of the ECMAscript language and plscheme already exists as a PostgreSQL PL extension for integrating with Guile.

http://plscheme.projects.postgresql.org/
http://wingolog.org/archives/2009/02/22/ecmascript-for-guile
http://packages.debian.org/sid/guile-2.0
http://www.gnu.org/software/guile/

  Guile is an extension language platform. Guile is an efficient virtual machine that executes a portable instruction set generated by its optimizing compiler, and integrates very easily with C and C++ application code. In addition to Scheme, Guile includes compiler front-ends for ECMAScript and Emacs Lisp (support for Lua is underway), which means your application can be extended in the language (or languages) most appropriate for your user base. And Guile's tools for parsing and compiling are exposed as part of its standard module set, so support for additional languages can be added without writing a single line of C.

Regards,

--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support