Re: [HACKERS] json accessors
On Dec 5, 2012, at 11:51 AM, Andrew Dunstan wrote: >> So I'm happy with this stuff, as long as it does not get in the way of >> supporting indexing at some point in the future. I can’t wait to start using >> it! > > I don't see why it should get in the way of anything like that. If anything, > the parser design changes I have proposed should make later development much > easier. Awesome, thanks! David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On Wed, Dec 5, 2012 at 12:42 PM, David E. Wheeler wrote: > On Dec 5, 2012, at 9:57 AM, Merlin Moncure wrote: > >> Indexing large documents for fancy querying is a niche case but also >> quite complex. This isn't very well covered by xmlpath either btw -- >> I think for inspiration we should be looking at hstore. > > Agreed, although hstore, IIRC, does not support nesting. > >> That said, how would you do that? The first thing that jumps into my >> mind is to cut right to the chase: Maybe the semantics could be >> defined so that implement hackstack @> needle would reasonable cover >> most cases. > > Yes. > >> So my takeaways are: >> *) decomposition != precise searching. andrew's api handles the >> former and stands on it's own merits. > > Agreed. > >> *) xmlpath/jsonpath do searching (and decomposition) but are very >> clunky from sql perspective and probably absolutely nogo in terms if >> GIST/GIN. postgres spiritually wants to do things via operators and >> we should (if possible) at least consider that first > > I don't understand how xmlpath/jsonpath is not able to be implemented with > operators. yeah -- i phrased that badly -- by 'operators' I meant that on both sides would be json document with absolute minimum fanciness such as wildcards and predicate matches. basically, 'overlaps' and (especially) 'contains'. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 12/05/2012 01:48 PM, David E. Wheeler wrote: I'm sorry what I have offered isn't what you want, but plenty of other people have told me it will go a long way meeting their needs. *Sigh.* I guess I have not been clear. The stuff you propose is *awesome*. I love it. The syntax with the chaining operators warms my heart, and I can’t wait to make *extensive* use of it in my procedural code. Maybe I would never *need* to do column queries of JSON contents often enough to require an expensive index. OK, sorry if I misunderstood. I guess I'm trying pretty hard to concentrate on what can be accomplished now, and other people are talking about blue sky possibilities. So I'm happy with this stuff, as long as it does not get in the way of supporting indexing at some point in the future. I can’t wait to start using it! I don't see why it should get in the way of anything like that. If anything, the parser design changes I have proposed should make later development much easier. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 12/05/2012 01:49 PM, Josh Berkus wrote: *) xmlpath/jsonpath do searching (and decomposition) but are very clunky from sql perspective and probably absolutely nogo in terms if GIST/GIN. postgres spiritually wants to do things via operators and we should (if possible) at least consider that first Why is it a nogo for GiST? Ltree works, doesn't it? If we only support equality lookups in what way is a JSON doc different from a collection of ltree rows? We'd probably want to use SP-GiST for better index size/performance, but I don't see that this is impossible. Just some difficult code. The set of paths for a single json datum can be huge, as opposed to one for a single ltree datum. That strikes me as a serious barrier. In any case, nobody I know of is even offering to do this - when they do we can look at the design. Until then I'm assuming nothing. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 2012-12-05 10:49:35 -0800, Josh Berkus wrote: > > >> *) xmlpath/jsonpath do searching (and decomposition) but are very > >> clunky from sql perspective and probably absolutely nogo in terms if > >> GIST/GIN. postgres spiritually wants to do things via operators and > >> we should (if possible) at least consider that first > > Why is it a nogo for GiST? Ltree works, doesn't it? If we only support > equality lookups in what way is a JSON doc different from a collection > of ltree rows? The space requirement for the paths are quite different. Its not that hard to build indexing support, its hard to build efficient support. The more you hide from postgres (i.e. behind a single very complex operator/function) the harder it is for the planner to detect whether your expression is indexable or not. > We'd probably want to use SP-GiST for better index size/performance, but > I don't see that this is impossible. Just some difficult code. I don't immediately see why SP-Gist would be be beneficial. What kind of access structure do you have in mind? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On Wed, Dec 5, 2012 at 12:49 PM, Josh Berkus wrote: > >>> *) xmlpath/jsonpath do searching (and decomposition) but are very >>> clunky from sql perspective and probably absolutely nogo in terms if >>> GIST/GIN. postgres spiritually wants to do things via operators and >>> we should (if possible) at least consider that first > > Why is it a nogo for GiST? Ltree works, doesn't it? If we only support > equality lookups in what way is a JSON doc different from a collection > of ltree rows? > > We'd probably want to use SP-GiST for better index size/performance, but > I don't see that this is impossible. Just some difficult code. huh -- good point. xpath at least is quite complicated and likely impractical (albeit not impossible) to marry with GIST in a meaningful way. jsonpath (at least AIUI from here: http://code.google.com/p/json-path/) seems to be lighter weight as is all things json when stacked up against xml. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
>> *) xmlpath/jsonpath do searching (and decomposition) but are very >> clunky from sql perspective and probably absolutely nogo in terms if >> GIST/GIN. postgres spiritually wants to do things via operators and >> we should (if possible) at least consider that first Why is it a nogo for GiST? Ltree works, doesn't it? If we only support equality lookups in what way is a JSON doc different from a collection of ltree rows? We'd probably want to use SP-GiST for better index size/performance, but I don't see that this is impossible. Just some difficult code. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On Dec 5, 2012, at 10:04 AM, Andrew Dunstan wrote: > Indexing tree-like data isn't at all easy. We don't index XML either. There > has been discussion of this sort of indexing it in the past, and a couple of > people have said they would work on it, but I have not seen a proposal or a > single line of code. Yeah, I forgot that xmlpath was not indexable. > Jsonpath on its own would not do what you're suggesting. A first approach to > indexing treeish data requires that you generate all the possible paths and > index that. That would be quite explosive in volume. And anyway, jsonpath is > not on offer here. Yeah, explosive for sure, but for sufficiently small JSON values, that shouldn’t be much of an issue. I expect GINs to be expensive anyway (see full-text indexing). I am not invested in jsonpath; I just cited it as an example of using a single function call to do a nested search. Obviously `json_get(json, variadic text)` allows this, too, and could potentially use a GIN index of a JSON tree to perform the variadic text search at some point in the future, yes? > I'm sorry what I have offered isn't what you want, but plenty of other people > have told me it will go a long way meeting their needs. *Sigh.* I guess I have not been clear. The stuff you propose is *awesome*. I love it. The syntax with the chaining operators warms my heart, and I can’t wait to make *extensive* use of it in my procedural code. Maybe I would never *need* to do column queries of JSON contents often enough to require an expensive index. So I'm happy with this stuff, as long as it does not get in the way of supporting indexing at some point in the future. I can’t wait to start using it! Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On Dec 5, 2012, at 9:57 AM, Merlin Moncure wrote: > Indexing large documents for fancy querying is a niche case but also > quite complex. This isn't very well covered by xmlpath either btw -- > I think for inspiration we should be looking at hstore. Agreed, although hstore, IIRC, does not support nesting. > That said, how would you do that? The first thing that jumps into my > mind is to cut right to the chase: Maybe the semantics could be > defined so that implement hackstack @> needle would reasonable cover > most cases. Yes. > So my takeaways are: > *) decomposition != precise searching. andrew's api handles the > former and stands on it's own merits. Agreed. > *) xmlpath/jsonpath do searching (and decomposition) but are very > clunky from sql perspective and probably absolutely nogo in terms if > GIST/GIN. postgres spiritually wants to do things via operators and > we should (if possible) at least consider that first I don't understand how xmlpath/jsonpath is not able to be implemented with operators. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 12/05/2012 12:29 PM, David E. Wheeler wrote: On Dec 5, 2012, at 9:21 AM, Andrew Dunstan wrote: For sufficiently large columns, I expect I would want a GIN index to speed JSON value extraction queries. Possible with this proposal? Probably not. That greatly reduces its utility for querying, though not, of course, for using it in procedural code. Wouldn't using a jsonpath-style implementation allow for indexing? Indexing tree-like data isn't at all easy. We don't index XML either. There has been discussion of this sort of indexing it in the past, and a couple of people have said they would work on it, but I have not seen a proposal or a single line of code. Jsonpath on its own would not do what you're suggesting. A first approach to indexing treeish data requires that you generate all the possible paths and index that. That would be quite explosive in volume. And anyway, jsonpath is not on offer here. I'm sorry what I have offered isn't what you want, but plenty of other people have told me it will go a long way meeting their needs. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On Wed, Dec 5, 2012 at 11:14 AM, David E. Wheeler wrote: > On Nov 28, 2012, at 4:10 PM, Merlin Moncure wrote: > >>> Yes, it's iterative. And for deeply nested json it might be somewhat >>> inefficient, although the parser is pretty fast AFAICT. But it's a start. >> >> not completely buying that: see comments below. not supporting xpath >> style decompositions seems wrong to me. IOW, json_get should be set >> returning (perhaps via wild cards in the keytext) or we need >> json_each. > > The problem I see with the current proposal is that this limitation, it seems > to me, would prevent the ability to index nested keys. If you're essentially > composing and decomposing JSON values as you drill down, the intermediate > JSON values between the original one and the final return value can't be > indexed, can they? > > For sufficiently large columns, I expect I would want a GIN index to speed > JSON value extraction queries. Possible with this proposal? I think best practices for JSON manipulation (at least in performance sensitive cases with large documents) are going to be to fully decompose into sql structures and manipulate after the fact. JSON's primary role is to serve as data exchange and Andrew's API (with the tweaks he came up with) seems to facilitate that pretty well; full decomposition is a snap. Indexing large documents for fancy querying is a niche case but also quite complex. This isn't very well covered by xmlpath either btw -- I think for inspiration we should be looking at hstore. That said, how would you do that? The first thing that jumps into my mind is to cut right to the chase: Maybe the semantics could be defined so that implement hackstack @> needle would reasonable cover most cases. So my takeaways are: *) decomposition != precise searching. andrew's api handles the former and stands on it's own merits. *) xmlpath/jsonpath do searching (and decomposition) but are very clunky from sql perspective and probably absolutely nogo in terms if GIST/GIN. postgres spiritually wants to do things via operators and we should (if possible) at least consider that first merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On Dec 5, 2012, at 9:21 AM, Andrew Dunstan wrote: >> For sufficiently large columns, I expect I would want a GIN index to speed >> JSON value extraction queries. Possible with this proposal? > > Probably not. That greatly reduces its utility for querying, though not, of course, for using it in procedural code. Wouldn't using a jsonpath-style implementation allow for indexing? Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 12/05/2012 12:14 PM, David E. Wheeler wrote: On Nov 28, 2012, at 4:10 PM, Merlin Moncure wrote: Yes, it's iterative. And for deeply nested json it might be somewhat inefficient, although the parser is pretty fast AFAICT. But it's a start. not completely buying that: see comments below. not supporting xpath style decompositions seems wrong to me. IOW, json_get should be set returning (perhaps via wild cards in the keytext) or we need json_each. The problem I see with the current proposal is that this limitation, it seems to me, would prevent the ability to index nested keys. If you're essentially composing and decomposing JSON values as you drill down, the intermediate JSON values between the original one and the final return value can't be indexed, can they? For sufficiently large columns, I expect I would want a GIN index to speed JSON value extraction queries. Possible with this proposal? Probably not. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 12/05/2012 12:11 PM, David E. Wheeler wrote: On Dec 4, 2012, at 10:05 AM, Josh Berkus wrote: json_get(json, variadic text) => json Given that I already do the equivalent in Python, this would suit me well. Not sure about other users ... Well, given that sometimes you will have mixed arrays and objects, how would you distinguish "42" as an object key or an array index? if the thing is an array, test to see if the string is a valid integer string, and if so use the integer value. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On Nov 28, 2012, at 4:10 PM, Merlin Moncure wrote: >> Yes, it's iterative. And for deeply nested json it might be somewhat >> inefficient, although the parser is pretty fast AFAICT. But it's a start. > > not completely buying that: see comments below. not supporting xpath > style decompositions seems wrong to me. IOW, json_get should be set > returning (perhaps via wild cards in the keytext) or we need > json_each. The problem I see with the current proposal is that this limitation, it seems to me, would prevent the ability to index nested keys. If you're essentially composing and decomposing JSON values as you drill down, the intermediate JSON values between the original one and the final return value can't be indexed, can they? For sufficiently large columns, I expect I would want a GIN index to speed JSON value extraction queries. Possible with this proposal? Best, David PS: SOrry for the delayed replies, digging my way out of a couple weeks of back posts… -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On Dec 4, 2012, at 10:05 AM, Josh Berkus wrote: >>json_get(json, variadic text) => json > > Given that I already do the equivalent in Python, this would suit me > well. Not sure about other users ... Well, given that sometimes you will have mixed arrays and objects, how would you distinguish "42" as an object key or an array index? Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
> Yes, you are, rather. It might be possible to do something like: > > json_get(json, variadic text) => json Given that I already do the equivalent in Python, this would suit me well. Not sure about other users ... > as long as it doesn't involve any testing beyond field name / array > index equivalence. I'm sure people will *ask* for more in the future, but you could do a LOT with just an equivalence version. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 12/03/2012 08:14 PM, Josh Berkus wrote: Andrew, What about doing: json_get(json, json) returns json where parameter #2 is a path expressed as JSON? For example, json_get(personal_profile, '[ {contact_info {phone numbers {cell phones} } } ]') ... would return whatever was in that heirarchical object, in this case an array of cell phone numbers. Or am I just reinventing jsonpath? Yes, you are, rather. It might be possible to do something like: json_get(json, variadic text) => json as long as it doesn't involve any testing beyond field name / array index equivalence. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
Andrew, What about doing: json_get(json, json) returns json where parameter #2 is a path expressed as JSON? For example, json_get(personal_profile, '[ {contact_info {phone numbers {cell phones} } } ]') ... would return whatever was in that heirarchical object, in this case an array of cell phone numbers. Or am I just reinventing jsonpath? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 11/30/2012 10:59 AM, Hannu Krosing wrote: Btw, how does current json type handle code pages - is json always utf-8 even when server encoding is not ? if so then we could at least have a shortcut conversion of json to utf8-text which can skip codepage changes. IIRC json is stored and processed in the server encoding. Normally it would make sense to have that be utf8. It is delivered to the client in the client encoding. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 11/30/2012 04:29 PM, Andrew Dunstan wrote: On 11/30/2012 10:04 AM, Hannu Krosing wrote: OK, so based on this discussion, I'm thinking of the following: * keep the original functions and operators. json_keys is still required for the case where the json is not flat. * json_each(json) => setof (text, text) errors if the json is not a flat object Why not json_each(json) => setof (text, json) ? with no erroring out ? if the json does represent text it is easy to convert to text on the query side. Well, it would be possible, sure. I'm not sure how useful. Or we could do both fairly easily. It's not as simple or efficient as you might think to dequote / de-escape json string values, which is why the original API had variants for returning both types of values. Maybe we need a function for doing just that. Btw, how does current json type handle code pages - is json always utf-8 even when server encoding is not ? if so then we could at least have a shortcut conversion of json to utf8-text which can skip codepage changes. -- Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 11/30/2012 10:04 AM, Hannu Krosing wrote: OK, so based on this discussion, I'm thinking of the following: * keep the original functions and operators. json_keys is still required for the case where the json is not flat. * json_each(json) => setof (text, text) errors if the json is not a flat object Why not json_each(json) => setof (text, json) ? with no erroring out ? if the json does represent text it is easy to convert to text on the query side. Well, it would be possible, sure. I'm not sure how useful. Or we could do both fairly easily. It's not as simple or efficient as you might think to dequote / de-escape json string values, which is why the original API had variants for returning both types of values. Maybe we need a function for doing just that. * json_unnest(json) => setof json errors if the json is not an array * json_unnest_each => setof (int, text, text) errors if the array is not an array of flat objects json_unnest_each => setof (int, text, json) ditto. * populate_record(record, json) => record errors if the json isn't a flat object errors if the values are not castable to records field types nb! some nonflatness is castable. especially to json or hstore or record types If the record has a json field, certainly. If it has a record field, fairly likely. hstore could probably be a problem given it's not a core type. Similarly to the generation functions discussed in another thread, I could possibly look up a cast from json to the non-core type and use it. That might work for hstore. I'll try to keep this as permissive as possible. * populate_recordset(record, json) => setof record errors if the json is not an array of flat objects ditto ditto ;-) cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 11/30/2012 03:38 PM, Andrew Dunstan wrote: On 11/29/2012 06:34 PM, Merlin Moncure wrote: On Thu, Nov 29, 2012 at 4:14 PM, Andrew Dunstan wrote: There are many things wrong with this. First, converting to hstore so you can call populate_record is quite horrible and ugly and inefficient. And it's dependent on having hstore loaded - you can't have an hstore_to_jon in core because hstore itself isn't in core. If you want a populate_record that takes data from json we should have one coded direct. I'm happy to add it to the list as long as everyone understands the limitations. Given a function to unnest the json array, which I already suggested upthread, you could do what you suggested above much more elegantly and directly. I wasn't suggesting you added the hstore stuff and I understand perfectly well the awkwardness of the hstore route. That said, this is how people are going to use your api so it doesn't hurt to go through the motions; I'm just feeling out how code in the wild would shape up. Anyways, my example was busted since you'd need an extra step to move the set returning output from the json array unnest() into a 'populate_record' type function call. So, AIUI I think you're proposing (i'm assuming optional quotes) following my example above: INSERT INTO foo(a,b) SELECT json_get_as_text(v, 'a')::int, json_get_as_text(v, 'b')::int FROM json_each() v; /* gives you array of json (a,b) records */ a hypothetical 'json_to_record (cribbing usage from populate_record)' variant might look like (please note, I'm not saying 'write this now', just feeling it out):: INSERT INTO foo(a,b) SELECT r.* FROM json_each() v, LATERAL json_to_record(null::foo, v) r; you're right: that's pretty clean. An json_object_each(json), => key, value couldn't hurt either -- this would handle those oddball cases of really wide objects that you occasionally see in json. Plus as_text variants of both each and object_each. If you're buying json_object_each, I think you can scrap json_object_keys(). OK, so based on this discussion, I'm thinking of the following: * keep the original functions and operators. json_keys is still required for the case where the json is not flat. * json_each(json) => setof (text, text) errors if the json is not a flat object Why not json_each(json) => setof (text, json) ? with no erroring out ? if the json does represent text it is easy to convert to text on the query side. * json_unnest(json) => setof json errors if the json is not an array * json_unnest_each => setof (int, text, text) errors if the array is not an array of flat objects json_unnest_each => setof (int, text, json) * populate_record(record, json) => record errors if the json isn't a flat object errors if the values are not castable to records field types nb! some nonflatness is castable. especially to json or hstore or record types * populate_recordset(record, json) => setof record errors if the json is not an array of flat objects ditto Note that I've added a couple of things to deal with json that represents a recordset (i.e. an array of objects). This is a very common pattern and one well worth optimizing for. I think that would let you do a lot of what you want pretty cleanly. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On Fri, Nov 30, 2012 at 9:02 AM, Andrew Dunstan wrote: > > On 11/30/2012 09:51 AM, Merlin Moncure wrote: >> >> >> Two questions: >> 1) is it possible for these to work without a polymorphic object >> passed through as hstore does (null::foo)? >> select populate_record(anyelement, record, json) > > > I don't understand the question. The API I'm suggesting is exactly in line > with hstore's, which uses a polymorphic parameter. I don't see how it can > not, and I don't understand why you would have 3 parameters. my mistake: I misread the function as you write it. it's good as is. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 11/30/2012 09:51 AM, Merlin Moncure wrote: Two questions: 1) is it possible for these to work without a polymorphic object passed through as hstore does (null::foo)? select populate_record(anyelement, record, json) I don't understand the question. The API I'm suggesting is exactly in line with hstore's, which uses a polymorphic parameter. I don't see how it can not, and I don't understand why you would have 3 parameters. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On Fri, Nov 30, 2012 at 8:38 AM, Andrew Dunstan wrote: > OK, so based on this discussion, I'm thinking of the following: ok, this is looking awesome -- couple naming suggestions (see inline): > * keep the original functions and operators. json_keys is still >required for the case where the json is not flat. > * json_each(json) => setof (text, text) >errors if the json is not a flat object > * json_unnest(json) => setof json >errors if the json is not an array I wonder if usage of 'unnest' is appropriate: sql unnest() *completely* unwraps the array to a list of scalars where as json unnest() only peels of one level. If you agree with that (it's debatable), how about json_array_each()? > * json_unnest_each => setof (int, text, text) >errors if the array is not an array of flat objects I like this. Maybe json_object_each() if you agree with my analysis above. > * populate_record(record, json) => record >errors if the json isn't a flat object > * populate_recordset(record, json) => setof record >errors if the json is not an array of flat objects Two questions: 1) is it possible for these to work without a polymorphic object passed through as hstore does (null::foo)? select populate_record(anyelement, record, json) 2) in keeping with naming style of json api, how about json_to_record, json_to_recordset? Maybe though keeping similarity with hstore convention is more important. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 11/29/2012 06:34 PM, Merlin Moncure wrote: On Thu, Nov 29, 2012 at 4:14 PM, Andrew Dunstan wrote: There are many things wrong with this. First, converting to hstore so you can call populate_record is quite horrible and ugly and inefficient. And it's dependent on having hstore loaded - you can't have an hstore_to_jon in core because hstore itself isn't in core. If you want a populate_record that takes data from json we should have one coded direct. I'm happy to add it to the list as long as everyone understands the limitations. Given a function to unnest the json array, which I already suggested upthread, you could do what you suggested above much more elegantly and directly. I wasn't suggesting you added the hstore stuff and I understand perfectly well the awkwardness of the hstore route. That said, this is how people are going to use your api so it doesn't hurt to go through the motions; I'm just feeling out how code in the wild would shape up. Anyways, my example was busted since you'd need an extra step to move the set returning output from the json array unnest() into a 'populate_record' type function call. So, AIUI I think you're proposing (i'm assuming optional quotes) following my example above: INSERT INTO foo(a,b) SELECT json_get_as_text(v, 'a')::int, json_get_as_text(v, 'b')::int FROM json_each() v; /* gives you array of json (a,b) records */ a hypothetical 'json_to_record (cribbing usage from populate_record)' variant might look like (please note, I'm not saying 'write this now', just feeling it out):: INSERT INTO foo(a,b) SELECT r.* FROM json_each() v, LATERAL json_to_record(null::foo, v) r; you're right: that's pretty clean. An json_object_each(json), => key, value couldn't hurt either -- this would handle those oddball cases of really wide objects that you occasionally see in json. Plus as_text variants of both each and object_each. If you're buying json_object_each, I think you can scrap json_object_keys(). OK, so based on this discussion, I'm thinking of the following: * keep the original functions and operators. json_keys is still required for the case where the json is not flat. * json_each(json) => setof (text, text) errors if the json is not a flat object * json_unnest(json) => setof json errors if the json is not an array * json_unnest_each => setof (int, text, text) errors if the array is not an array of flat objects * populate_record(record, json) => record errors if the json isn't a flat object * populate_recordset(record, json) => setof record errors if the json is not an array of flat objects Note that I've added a couple of things to deal with json that represents a recordset (i.e. an array of objects). This is a very common pattern and one well worth optimizing for. I think that would let you do a lot of what you want pretty cleanly. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On Thu, Nov 29, 2012 at 4:14 PM, Andrew Dunstan wrote: > There are many things wrong with this. First, converting to hstore so you > can call populate_record is quite horrible and ugly and inefficient. And > it's dependent on having hstore loaded - you can't have an hstore_to_jon in > core because hstore itself isn't in core. If you want a populate_record that > takes data from json we should have one coded direct. I'm happy to add it to > the list as long as everyone understands the limitations. Given a function > to unnest the json array, which I already suggested upthread, you could do > what you suggested above much more elegantly and directly. I wasn't suggesting you added the hstore stuff and I understand perfectly well the awkwardness of the hstore route. That said, this is how people are going to use your api so it doesn't hurt to go through the motions; I'm just feeling out how code in the wild would shape up. Anyways, my example was busted since you'd need an extra step to move the set returning output from the json array unnest() into a 'populate_record' type function call. So, AIUI I think you're proposing (i'm assuming optional quotes) following my example above: INSERT INTO foo(a,b) SELECT json_get_as_text(v, 'a')::int, json_get_as_text(v, 'b')::int FROM json_each() v; /* gives you array of json (a,b) records */ a hypothetical 'json_to_record (cribbing usage from populate_record)' variant might look like (please note, I'm not saying 'write this now', just feeling it out):: INSERT INTO foo(a,b) SELECT r.* FROM json_each() v, LATERAL json_to_record(null::foo, v) r; you're right: that's pretty clean. An json_object_each(json), => key, value couldn't hurt either -- this would handle those oddball cases of really wide objects that you occasionally see in json. Plus as_text variants of both each and object_each. If you're buying json_object_each, I think you can scrap json_object_keys(). merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 11/29/2012 04:52 PM, Merlin Moncure wrote: On Thu, Nov 29, 2012 at 1:19 PM, Andrew Dunstan wrote: On 11/29/2012 01:06 PM, Merlin Moncure wrote: so, just hashing out your proposal and making sure it's reasonable analogous implementation of xpath. Sleeping on it, I say mostly, but not quite. how about some changes for json_get: 1) return setof (key, value) in the style of jquery each(). 2) we need some way of indicating in the keytext path that we want to unnest the collecton pointed to by keytext or to just return it. for example, ->* as indicator? 3) use double quotes, and make them optional (as hstore) 4) speaking of hstore, prefer => vs ->?So I don't think your modifications are well thought out. if you do at least #1 and #2, json_get I think can cover all the bases for parsing json, meaning you could reproduce the behaviors for each of your four proposed just as xpath does for xml. (you may still want to add them for posterity or performance though). so no need for json_each or json_array_unnest etc. json_get is designed to return a single thing. What is more, returning a (key, value) pair seems quite silly when you're passing the key in as an argument. It's not designed to be json_path or json_query, and it's not designed either to take a path expression as an argument. So I don't think this is a good direction. Your proposed mods to json_get modify it out of all recognition. If I offer you a horse and ask what colour you'd like, asking for a lion instead isn't a good response :-) (Repeating myself), I also suggest exposing the transform API so that it will be easy to construct further functions as extensions. I'm not trying to cover the field. The intention here is to provide some very basic json accessors as core functions / operators. Right. But you're not offering a horse to the farm...but to the zoo. json is in core so I don't think you have the luxury of offering a clunky API now withe expectation of a sleeker, faster one in the future as the old functions will sit around forever in the public namespace. What is present in the API doesn't have to cover all reasonable use cases but it certainly should be expected withstand the test of time for the cases it does cover. Sketch out how a object array of indeterminate size would be parsed and placed into records with a set returning/array returning and non-set returning json_get: which is a better fit? xpath() doesn't work iteratively and nobody has ever complained about that to my recollection. table: create table foo (a int, b int); document: [{"a": 1, "b": 2}, {"a": 3, "b": 4}, ... {"a": 9, "b": 10}] set returning json_get: INSERT INTO foo SELECT * FROM populate_record(null, hstore_to_json((json_get(*)).value)); assuming '*' is the 'expand this' operator in your 'keytext' expression that I was suggestion. How would this work with your proposed API? This is a very typical use case. There are many things wrong with this. First, converting to hstore so you can call populate_record is quite horrible and ugly and inefficient. And it's dependent on having hstore loaded - you can't have an hstore_to_jon in core because hstore itself isn't in core. If you want a populate_record that takes data from json we should have one coded direct. I'm happy to add it to the list as long as everyone understands the limitations. Given a function to unnest the json array, which I already suggested upthread, you could do what you suggested above much more elegantly and directly. Also, BTW, you can't use * that way. We are not replicating xpath here for json. Sorry, but that's not my goal. If you want to code that up, be my guest. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On Thu, Nov 29, 2012 at 1:19 PM, Andrew Dunstan wrote: > > On 11/29/2012 01:06 PM, Merlin Moncure wrote: >> >> so, just hashing out your proposal and making sure it's reasonable >> analogous implementation of xpath. Sleeping on it, I say mostly, but >> not quite. how about some changes for json_get: >> >> 1) return setof (key, value) in the style of jquery each(). >> 2) we need some way of indicating in the keytext path that we want to >> unnest the collecton pointed to by keytext or to just return it. for >> example, ->* as indicator? >> 3) use double quotes, and make them optional (as hstore) >> 4) speaking of hstore, prefer => vs ->?So I don't think your modifications >> are well thought out. >> >> >> if you do at least #1 and #2, json_get I think can cover all the bases >> for parsing json, meaning you could reproduce the behaviors for each >> of your four proposed just as xpath does for xml. (you may still >> want to add them for posterity or performance though). so no need for >> json_each or json_array_unnest etc. > > > json_get is designed to return a single thing. What is more, returning a > (key, value) pair seems quite silly when you're passing the key in as an > argument. It's not designed to be json_path or json_query, and it's not > designed either to take a path expression as an argument. So I don't think > this is a good direction. Your proposed mods to json_get modify it out of > all recognition. If I offer you a horse and ask what colour you'd like, > asking for a lion instead isn't a good response :-) > > (Repeating myself), I also suggest exposing the transform API so that it > will be easy to construct further functions as extensions. I'm not trying to > cover the field. The intention here is to provide some very basic json > accessors as core functions / operators. Right. But you're not offering a horse to the farm...but to the zoo. json is in core so I don't think you have the luxury of offering a clunky API now withe expectation of a sleeker, faster one in the future as the old functions will sit around forever in the public namespace. What is present in the API doesn't have to cover all reasonable use cases but it certainly should be expected withstand the test of time for the cases it does cover. Sketch out how a object array of indeterminate size would be parsed and placed into records with a set returning/array returning and non-set returning json_get: which is a better fit? xpath() doesn't work iteratively and nobody has ever complained about that to my recollection. table: create table foo (a int, b int); document: [{"a": 1, "b": 2}, {"a": 3, "b": 4}, ... {"a": 9, "b": 10}] set returning json_get: INSERT INTO foo SELECT * FROM populate_record(null, hstore_to_json((json_get(*)).value)); assuming '*' is the 'expand this' operator in your 'keytext' expression that I was suggestion. How would this work with your proposed API? This is a very typical use case. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 11/29/2012 01:06 PM, Merlin Moncure wrote: so, just hashing out your proposal and making sure it's reasonable analogous implementation of xpath. Sleeping on it, I say mostly, but not quite. how about some changes for json_get: 1) return setof (key, value) in the style of jquery each(). 2) we need some way of indicating in the keytext path that we want to unnest the collecton pointed to by keytext or to just return it. for example, ->* as indicator? 3) use double quotes, and make them optional (as hstore) 4) speaking of hstore, prefer => vs ->?So I don't think your modifications are well thought out. if you do at least #1 and #2, json_get I think can cover all the bases for parsing json, meaning you could reproduce the behaviors for each of your four proposed just as xpath does for xml. (you may still want to add them for posterity or performance though). so no need for json_each or json_array_unnest etc. json_get is designed to return a single thing. What is more, returning a (key, value) pair seems quite silly when you're passing the key in as an argument. It's not designed to be json_path or json_query, and it's not designed either to take a path expression as an argument. So I don't think this is a good direction. Your proposed mods to json_get modify it out of all recognition. If I offer you a horse and ask what colour you'd like, asking for a lion instead isn't a good response :-) (Repeating myself), I also suggest exposing the transform API so that it will be easy to construct further functions as extensions. I'm not trying to cover the field. The intention here is to provide some very basic json accessors as core functions / operators. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On Thu, Nov 29, 2012 at 7:58 AM, Andrew Dunstan wrote: > On 11/28/2012 08:16 PM, Hannu Krosing wrote: >> You could even do a template-less row_from_json which returns a records >> with all fields converted to >> the JSON-encodable types and hope that the next conversions will be done >> by postgreSQL as needed. >> >> insert into tab1 select row_from_json('{"id":100, "ts":"2012-12-21", >> "data":"End of Everything"}'); >> >> insert into tab1 >> select * from row_from_json( >> '[{"id":101, "ts":"2012-12-22", "data":"1st day after End of Everything"} >> {"id":102, "ts":"2012-12-22", "data":"2nd day after End of Everything"} >> ]'); > > The real problem here is that for any irregularly shaped json it's likely to > be a bust, and could only possibly work sanely for nested json at all if the > target type had corresponding array and composite fields. again, that's pretty a fairly typical case -- crafting json documents specifically for consumption in postgres. defining backend types allows you to skip intermediate iterative marshaling step. > hstore's > populate_record works fairly well precisely because hstore is a flat > structure, unlike json. agreed. not trying to drag you into the weeds here. the above is neat functionality but doesn't cover all the cases so specific accessor functions in the vein of your proposal are still needed and the hstore workaround should work pretty well -- sugaring up the syntax for 'all in wonder' type translations of complicated structures can be done later if you want to keep things simple in the short term. so, just hashing out your proposal and making sure it's reasonable analogous implementation of xpath. Sleeping on it, I say mostly, but not quite. how about some changes for json_get: 1) return setof (key, value) in the style of jquery each(). 2) we need some way of indicating in the keytext path that we want to unnest the collecton pointed to by keytext or to just return it. for example, ->* as indicator? 3) use double quotes, and make them optional (as hstore) 4) speaking of hstore, prefer => vs ->? if you do at least #1 and #2, json_get I think can cover all the bases for parsing json, meaning you could reproduce the behaviors for each of your four proposed just as xpath does for xml. (you may still want to add them for posterity or performance though). so no need for json_each or json_array_unnest etc. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 11/28/2012 08:16 PM, Hannu Krosing wrote: On 11/29/2012 02:07 AM, Hannu Krosing wrote: On 11/29/2012 01:10 AM, Merlin Moncure wrote: On Wed, Nov 28, 2012 at 2:44 PM, Andrew Dunstan wrote: ... *) have you considered something like anyelement from_json(anyelement, json) or select ::some_type; (this may or many not be possible given our casting mechanics; i don't know). I have no idea what the semantics of this would be. Yeah, there's a lot of nuance there. One way to tackle it would give the argument element as a template and the result will the same template filled in from json filled create table tab1(id serial primary key, ts timestamp default now(), data text); insert into tab1 select from_json(row(null,null,null)::tab1, '{"data":"the data"}'); insert into tab1 select from_json(row(null,null,null)::tab1, '{"id":-1, "ts":null, "data":""}'); insert into tab1 select from_json(t.*,'{"data":"more data"}') from tab1 t where id = -1; hannu=# select row_to_json(t.*) from tab1 t; row_to_json --- {"id":1,"ts":"2012-11-29 02:01:48.379172","data":"the data"} {"id":-1,"ts":null, "data":""} {"id":2,"ts":"2012-11-29 02:02:34.600164","data":"more data"} (3 rows) if extracting the defaults from table def proves too tricky for first iteration, then just set the missing fields to NULL or even better, carry over the values from template; You could even do a template-less row_from_json which returns a records with all fields converted to the JSON-encodable types and hope that the next conversions will be done by postgreSQL as needed. insert into tab1 select row_from_json('{"id":100, "ts":"2012-12-21", "data":"End of Everything"}'); insert into tab1 select * from row_from_json( '[{"id":101, "ts":"2012-12-22", "data":"1st day after End of Everything"} {"id":102, "ts":"2012-12-22", "data":"2nd day after End of Everything"} ]'); The real problem here is that for any irregularly shaped json it's likely to be a bust, and could only possibly work sanely for nested json at all if the target type had corresponding array and composite fields. hstore's populate_record works fairly well precisely because hstore is a flat structure, unlike json. In any case, I think this sort of suggestion highlights the possible benefits of what I suggested upthread, namely to expose an API that will allow easy construction of json transformation functions as extensions. PS: good work so far :) Hannu Thanks. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 11/29/2012 02:07 AM, Hannu Krosing wrote: On 11/29/2012 01:10 AM, Merlin Moncure wrote: On Wed, Nov 28, 2012 at 2:44 PM, Andrew Dunstan wrote: ... *) have you considered something like anyelement from_json(anyelement, json) or select ::some_type; (this may or many not be possible given our casting mechanics; i don't know). I have no idea what the semantics of this would be. Yeah, there's a lot of nuance there. One way to tackle it would give the argument element as a template and the result will the same template filled in from json filled create table tab1(id serial primary key, ts timestamp default now(), data text); insert into tab1 select from_json(row(null,null,null)::tab1, '{"data":"the data"}'); insert into tab1 select from_json(row(null,null,null)::tab1, '{"id":-1, "ts":null, "data":""}'); insert into tab1 select from_json(t.*,'{"data":"more data"}') from tab1 t where id = -1; hannu=# select row_to_json(t.*) from tab1 t; row_to_json --- {"id":1,"ts":"2012-11-29 02:01:48.379172","data":"the data"} {"id":-1,"ts":null, "data":""} {"id":2,"ts":"2012-11-29 02:02:34.600164","data":"more data"} (3 rows) if extracting the defaults from table def proves too tricky for first iteration, then just set the missing fields to NULL or even better, carry over the values from template; You could even do a template-less row_from_json which returns a records with all fields converted to the JSON-encodable types and hope that the next conversions will be done by postgreSQL as needed. insert into tab1 select row_from_json('{"id":100, "ts":"2012-12-21", "data":"End of Everything"}'); insert into tab1 select * from row_from_json( '[{"id":101, "ts":"2012-12-22", "data":"1st day after End of Everything"} {"id":102, "ts":"2012-12-22", "data":"2nd day after End of Everything"} ]'); Hannu -- Hannu PS: good work so far :) Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 11/29/2012 01:10 AM, Merlin Moncure wrote: On Wed, Nov 28, 2012 at 2:44 PM, Andrew Dunstan wrote: ... *) have you considered something like anyelement from_json(anyelement, json) or select ::some_type; (this may or many not be possible given our casting mechanics; i don't know). I have no idea what the semantics of this would be. Yeah, there's a lot of nuance there. One way to tackle it would give the argument element as a template and the result will the same template filled in from json filled create table tab1(id serial primary key, ts timestamp default now(), data text); insert into tab1 select from_json(row(null,null,null)::tab1, '{"data":"the data"}'); insert into tab1 select from_json(row(null,null,null)::tab1, '{"id":-1, "ts":null, "data":""}'); insert into tab1 select from_json(t.*,'{"data":"more data"}') from tab1 t where id = -1; hannu=# select row_to_json(t.*) from tab1 t; row_to_json --- {"id":1,"ts":"2012-11-29 02:01:48.379172","data":"the data"} {"id":-1,"ts":null, "data":""} {"id":2,"ts":"2012-11-29 02:02:34.600164","data":"more data"} (3 rows) if extracting the defaults from table def proves too tricky for first iteration, then just set the missing fields to NULL or even better, carry over the values from template; -- Hannu PS: good work so far :) Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On Wed, Nov 28, 2012 at 2:44 PM, Andrew Dunstan wrote: > On 11/28/2012 02:08 PM, Merlin Moncure wrote: >> *) ISTM your keytext operators are a reasonable replacement for a >> hypothetical json_path. That said you're basically forcing json->sql >> mapping through a highly iterative API, which I don't like. At the >> very least, I think json_get should return setof json and return all >> matching constructions. I won't miss predicate tests: we can do all >> that in SQL. > > > Yes, it's iterative. And for deeply nested json it might be somewhat > inefficient, although the parser is pretty fast AFAICT. But it's a start. not completely buying that: see comments below. not supporting xpath style decompositions seems wrong to me. IOW, json_get should be set returning (perhaps via wild cards in the keytext) or we need json_each. >> Non-trivial json productions in postgres require the creation of >> special composite types that structure the data that we (I?) rig up in >> SQL before routing to json. What about having functions that work in >> the opposite direction: >> >> *) can you access both arrays and records with numeric positional >> syntax (hopefully, yes?), for example: >> >> x->0->0 > > > You can't do that in JS, so I'm not clear why we should allow it. agreed -- withdrawn. >> *) json_object_keys(json) seems to special case to me. how about: >> >> json_each(json) which returns a set of key/value pairs and would on >> arrays or objects (for arrays the key could be invented from the >> index). > > Again, I don't think we should conflate the processing for arrays and > objects. But I could see doing each(json) => setof (text, json) (and maybe a > similar function returning setof (text, text), which would dequote leaf > nodes as json_get_as_text() does). > > And similarly a couple of functions to unnest arrays. Yeah. Although, I *do* think you need 'json_each' (or a set returning json_get) and they should be conflated...exactly as jquery does: http://api.jquery.com/jQuery.each/. json objects are associative arrays, right? So if the *value* that gets returned by json_each is itself a collection, we can cast back to json and recurse. at the very least, we ought to decompose large documents into arbitrary smaller chunks (as xpath does) without iterating. In most of the code I'd write, I would decompose to a json object using your stuff then route to something like: insert into foo select (r).* from populate_record(null::foo, json_to_hstore(x)) r from json_each('path->to->record_containg_array', json_document'); assuming the json was deliberately constructed to mashall cleanly into the database, which is perfectly reasonable. >> *) json_get_as_text(json, keytext or indexint) => text >> >> prefer json_to_text() naming. also json_to_hstore(), etc. > > json_to_text seems rather misleading as a name here. Maybe we could remove > the "_as" from the name if that's bothering you. hm, I think you're right here -- I see the distinction. > As for json_to_hstore, as I mentioned, the design is intended to enable the > easy constructyion of such transformations, although for hstores anything > except trivial json structure (i.e. an unnested object) it might have > unappealing results. But in any case, the important thing to do first is to > get the infrastructure in place. Time is very short and I don't want to > extend this very much. yeah, understood. >> *) have you considered something like >> anyelement from_json(anyelement, json) >> or >> select ::some_type; (this may or many not be possible given our >> casting mechanics; i don't know). > > I have no idea what the semantics of this would be. Yeah, there's a lot of nuance there. Don't have to tackle everything at once I suppose, but spiritually I'm hoping it would serve as replacement for textual record_in, array_in, etc. It's just wrong to have to specify each and every field in during parsing when the receiving structure is well defined in the database. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 11/28/2012 03:44 PM, Andrew Dunstan wrote: As for json_to_hstore, as I mentioned, the design is intended to enable the easy constructyion of such transformations, although for hstores anything except trivial json structure (i.e. an unnested object) it might have unappealing results. But in any case, the important thing to do first is to get the infrastructure in place. Time is very short and I don't want to extend this very much. The other thing about doing json_to_hstore() is that, since hstore is not itself a core type, we couldn't do that in the core json module, and therefore we'd either need to expose an API to the JSON parser or replicate it in the hstore module. Exposing it is probably the better way to go. Then people could write extensions that process json just by supplying the hooked functions. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 11/28/2012 02:08 PM, Merlin Moncure wrote: On Wed, Nov 28, 2012 at 11:04 AM, Andrew Dunstan wrote: This is a proposal to create some basic functions to extract values from json. The simple functions I envision would be: * json_object_keys(json) => setof text returns the set of dequoted, unescaped keys of the object, errors if it's not an object * json_get(json, keytext) => json returns the json value corresponding to the key text in the json object, null if not found, error if it's not an object * json_get(json, indexint) => json returns the json value of the indexth element in the json array, null of the index is outside the array bounds, errors if it's not an array * json_get_as_text(json, keytext or indexint) => text same as json_get() except that it returns dequoted, unescaped text for a quoted leaf field Comments (this is awesome btw): Thanks for the input. *) ISTM your keytext operators are a reasonable replacement for a hypothetical json_path. That said you're basically forcing json->sql mapping through a highly iterative API, which I don't like. At the very least, I think json_get should return setof json and return all matching constructions. I won't miss predicate tests: we can do all that in SQL. Yes, it's iterative. And for deeply nested json it might be somewhat inefficient, although the parser is pretty fast AFAICT. But it's a start. Non-trivial json productions in postgres require the creation of special composite types that structure the data that we (I?) rig up in SQL before routing to json. What about having functions that work in the opposite direction: *) can you access both arrays and records with numeric positional syntax (hopefully, yes?), for example: x->0->0 You can't do that in JS, so I'm not clear why we should allow it. *) json_object_keys(json) seems to special case to me. how about: json_each(json) which returns a set of key/value pairs and would on arrays or objects (for arrays the key could be invented from the index). Again, I don't think we should conflate the processing for arrays and objects. But I could see doing each(json) => setof (text, json) (and maybe a similar function returning setof (text, text), which would dequote leaf nodes as json_get_as_text() does). And similarly a couple of functions to unnest arrays. *) json_get_as_text(json, keytext or indexint) => text prefer json_to_text() naming. also json_to_hstore(), etc. json_to_text seems rather misleading as a name here. Maybe we could remove the "_as" from the name if that's bothering you. As for json_to_hstore, as I mentioned, the design is intended to enable the easy constructyion of such transformations, although for hstores anything except trivial json structure (i.e. an unnested object) it might have unappealing results. But in any case, the important thing to do first is to get the infrastructure in place. Time is very short and I don't want to extend this very much. *) have you considered something like anyelement from_json(anyelement, json) or select ::some_type; (this may or many not be possible given our casting mechanics; i don't know). I have no idea what the semantics of this would be. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On Wed, Nov 28, 2012 at 11:04 AM, Andrew Dunstan wrote: > > This is a proposal to create some basic functions to extract values from > json. The simple functions I envision would be: > > * json_object_keys(json) => setof text >returns the set of dequoted, unescaped keys of the object, >errors if it's not an object > * json_get(json, keytext) => json >returns the json value corresponding to the key text in the json object, >null if not found, error if it's not an object > * json_get(json, indexint) => json >returns the json value of the indexth element in the json array, >null of the index is outside the array bounds, errors if it's not an >array > * json_get_as_text(json, keytext or indexint) => text >same as json_get() except that it returns dequoted, unescaped text >for a quoted leaf field Comments (this is awesome btw): *) ISTM your keytext operators are a reasonable replacement for a hypothetical json_path. That said you're basically forcing json->sql mapping through a highly iterative API, which I don't like. At the very least, I think json_get should return setof json and return all matching constructions. I won't miss predicate tests: we can do all that in SQL. Non-trivial json productions in postgres require the creation of special composite types that structure the data that we (I?) rig up in SQL before routing to json. What about having functions that work in the opposite direction: *) can you access both arrays and records with numeric positional syntax (hopefully, yes?), for example: x->0->0 *) json_object_keys(json) seems to special case to me. how about: json_each(json) which returns a set of key/value pairs and would on arrays or objects (for arrays the key could be invented from the index). *) json_get_as_text(json, keytext or indexint) => text prefer json_to_text() naming. also json_to_hstore(), etc. *) have you considered something like anyelement from_json(anyelement, json) or select ::some_type; (this may or many not be possible given our casting mechanics; i don't know). My reasoning here is that for non-trivial json productions we (I?) typically use composite types to rigidly control the structure of the output document. For 'restful' type protocols I might want to use the same trick: there would be a set of nested composite type/arrays (or even, in trivial cases, a table) that would cleanly map to the document. The parsing here can and should be automatic; this would give nice symmetry with your xxx_to_json functions. Obviously conversion here would be best effort but when it works, it would be wonderful: WITH json_data AS ( SELECT from_json(null::foo[], ) ) i1 as (INSERT INTO bar SELECT ... FROM json_data) i2 as (INSERT INTO baz SELECT ... FROM json_data) where "..." would be some combination of unnest() and composite type access syntax. Now, some documents in json won't translate cleanly to composite types because json allows for heterogeneous arrays. But if we're in control of both sides of the protocol that shouldn't matter. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] json accessors
This is a proposal to create some basic functions to extract values from json. The simple functions I envision would be: * json_object_keys(json) => setof text returns the set of dequoted, unescaped keys of the object, errors if it's not an object * json_get(json, keytext) => json returns the json value corresponding to the key text in the json object, null if not found, error if it's not an object * json_get(json, indexint) => json returns the json value of the indexth element in the json array, null of the index is outside the array bounds, errors if it's not an array * json_get_as_text(json, keytext or indexint) => text same as json_get() except that it returns dequoted, unescaped text for a quoted leaf field I also propose to map the json_get functions to the operator '->' and json_get_as_text to '->>', so that given x has this json value: {"a":[{"b":"c","d":"e"},{"f":true,"g":1}]} the expression x->'a'->0->>'d' will yield 'e', x->'a'->0->'f' will yield 'true' and x->'a'->0 will yield '{"b":"c","d":"e"}'. The operators would make using these a whole lot nicer :-) Various people have suggested putting json_path or something similar into the core. I'm not sure we want to do that, partly because there are several competing entries in this field, and partly because I don't want to get into the business of evaluating json predicate tests, which I think any tolerably complete gadget would need to do. Regarding implementation, the way I propose to do this is to modify the json parser a bit to turn it into a recursive descent parser, with hooks for various operations. NULL hooks would leave us with the validating parser we have now with no side effects. The hook functions themselves will be very small. This would also allow us to do other things very simply at a later stage, for example a json to xml transformation function would be very easy to construct using this infrastructure, and without disturbing any existing functionality. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers