Re: [HACKERS] JSON Patch for PostgreSQL - BSON Support?
Robert Haas wrote: On Sun, Aug 15, 2010 at 11:03 PM, Tom Lane t...@sss.pgh.pa.us wrote: I knew there would be a lot of critters crawling out as soon as we turned over this rock. Which other data-formats-of-the-week shall we immortalize as core PG types? PER-encoded ASN.1, for when you really need something human-readable? :-) I like to prioritize with hits on Google as a proxy for popularity: XML: 341M CSV: 132M JSON: 96M YAML: 6M ASN.1: 1.1M BSON: 130K Protocol Buffers: 86K OGDL: 45K I think there's a strong case to add JSON, as it may very well be the most popular data-text serialization format out there not yet supported. It's certainly beyond a format of the week at this point. XML is like violence: if it doesn't solve your problem, you aren't using enough of it. - Chris Maden -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us
Re: [HACKERS] JSON Patch for PostgreSQL - BSON Support?
Andrew Dunstan and...@dunslane.net writes: If BSON is simply in effect an efficient encoding of JSON, then it's not clear to me that we would want another type at all. Rather, we might want to consider storing the data in this supposedly more efficient format, and maybe also some conversion routines. Hmm, that's an interesting plan ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON Patch for PostgreSQL - BSON Support?
On Mon, Aug 16, 2010 at 11:05 AM, Tom Lane t...@sss.pgh.pa.us wrote: Andrew Dunstan and...@dunslane.net writes: If BSON is simply in effect an efficient encoding of JSON, then it's not clear to me that we would want another type at all. Rather, we might want to consider storing the data in this supposedly more efficient format, and maybe also some conversion routines. Hmm, that's an interesting plan ... It is interesting, but I'm not sure that it will actually work out well in practice. If what we want to do is compress JSON, TOAST will do that for us without any additional code, and probably a lot more efficiently. Of course, until someone tests it, we're just speculating wildly. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON Patch for PostgreSQL - BSON Support?
On Mon, Aug 16, 2010 at 11:21 AM, Robert Haas robertmh...@gmail.com wrote: On Mon, Aug 16, 2010 at 11:05 AM, Tom Lane t...@sss.pgh.pa.us wrote: Andrew Dunstan and...@dunslane.net writes: If BSON is simply in effect an efficient encoding of JSON, then it's not clear to me that we would want another type at all. Rather, we might want to consider storing the data in this supposedly more efficient format, and maybe also some conversion routines. Hmm, that's an interesting plan ... It is interesting, but I'm not sure that it will actually work out well in practice. If what we want to do is compress JSON, TOAST will do that for us without any additional code, and probably a lot more efficiently. Of course, until someone tests it, we're just speculating wildly. Yep, that was exactly what struck me. TOAST is quite likely to be a good answer for this. The reason to want some other binary format would be if there were other benefits to be had. An XML encoding format could be interesting if it allowed having GIST-ish indexes to search for tags particularly efficiently. I say XML encoding because I've not got any reason to think that a JSON/BSON-only format would necessarily be preferable. But interesting isn't the same thing as the right answer. For now, TOAST seems perfectly reasonable. If there's some wire format for XML that would allow more efficient data transfer, that would be an improvement. BSON sounds like it's something like that, but only if it's better than flavour of the week. -- http://linuxfinances.info/info/linuxdistributions.html -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON Patch for PostgreSQL - BSON Support?
On Mon, 2010-08-16 at 11:40 -0400, Christopher Browne wrote: On Mon, Aug 16, 2010 at 11:21 AM, Robert Haas robertmh...@gmail.com wrote: On Mon, Aug 16, 2010 at 11:05 AM, Tom Lane t...@sss.pgh.pa.us wrote: Andrew Dunstan and...@dunslane.net writes: If BSON is simply in effect an efficient encoding of JSON, then it's not clear to me that we would want another type at all. Rather, we might want to consider storing the data in this supposedly more efficient format, and maybe also some conversion routines. Hmm, that's an interesting plan ... It is interesting, but I'm not sure that it will actually work out well in practice. If what we want to do is compress JSON, TOAST will do that for us without any additional code, and probably a lot more efficiently. Of course, until someone tests it, we're just speculating wildly. Yep, that was exactly what struck me. TOAST is quite likely to be a good answer for this. Except: How much JSON data will actually be TOASTed? Joshua D. Drake -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON Patch for PostgreSQL - BSON Support?
On 8/16/10 8:40 AM, Christopher Browne wrote: On Mon, Aug 16, 2010 at 11:21 AM, Robert Haasrobertmh...@gmail.com wrote: On Mon, Aug 16, 2010 at 11:05 AM, Tom Lanet...@sss.pgh.pa.us wrote: Andrew Dunstanand...@dunslane.net writes: If BSON is simply in effect an efficient encoding of JSON, then it's not clear to me that we would want another type at all. Rather, we might want to consider storing the data in this supposedly more efficient format, and maybe also some conversion routines. Hmm, that's an interesting plan ... It is interesting, but I'm not sure that it will actually work out well in practice. If what we want to do is compress JSON, TOAST will do that for us without any additional code, and probably a lot more efficiently. Of course, until someone tests it, we're just speculating wildly. Yep, that was exactly what struck me. TOAST is quite likely to be a good answer for this. The reason to want some other binary format would be if there were other benefits to be had. An XML encoding format could be interesting if it allowed having GIST-ish indexes to search for tags particularly efficiently. I say XML encoding because I've not got any reason to think that a JSON/BSON-only format would necessarily be preferable. But interesting isn't the same thing as the right answer. For now, TOAST seems perfectly reasonable. If there's some wire format for XML that would allow more efficient data transfer, that would be an improvement. BSON sounds like it's something like that, but only if it's better than flavour of the week. XML encoding has certainly been investigated within the W3C public docs: http://www.w3.org/2003/08/binary-interchange-workshop/Report.html (discussion) http://www.w3.org/TR/xbc-characterization/ (summary) Leading to the current draft of EXI: http://www.w3.org/XML/EXI/ The spec is a rather large undertaking. It makes sense to add to the XML ToDo wiki page. EXI will certainly be better than TOAST for larger XML docs. ... BSON does not compress text content -- TOAST would still have its advantages. It mainly shortens the representation of JSON data structures. Again, I think the primary benefit of BSON would be data traversal. The benefit is the same for a client receiving BSON, as the server. Data lengths are specified, allowing quick optimizations for things like key_exists and equivalencies. Client's supporting BSON could benefit from a quick pass-through. And I'd imagine a very slight benefit toward indexing, were GIN / hstore processes used. Still, as has been noted on this thread.. We don't have numbers to work with. With json as a core data type; and bson as a possible function working with the json type, there's not much of a risk going in either direction (text + TOAST, bson + TOAST). -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON Patch for PostgreSQL - BSON Support?
On Sun, Aug 15, 2010 at 11:47 PM, Andrew Dunstan and...@dunslane.net wrote: If BSON is simply in effect an efficient encoding of JSON, then it's not clear to me that we would want another type at all. Rather, we might want to consider storing the data in this supposedly more efficient format, and maybe also some conversion routines. An issue is that the current JSON data type implementation preserves the original text (meaning if you say '[1,2,\u0020 ]'::JSON, it will yield '[1,2,\u0020 ]' rather than '[1,2, ]' . I haven't researched BSON much at all, but I seriously doubt that part of its spec includes handling external JSON encoding details like whitespace and superfluous escapes. Even though I spent a long time implementing it, the original text preservation feature should be dropped, in my opinion. Users tend to care more about the data inside of a JSON value rather than how it's encoded, and replacement of values can have funky consequences on indentation when working with indented JSON text (similar to how pasting in a text editor puts pasted content at the wrong indent level). By dropping original text preservation, in the future (or now), JSON could be encoded in BSON (or a more efficient format) in the database rather than in JSON-encoded text. Also, an idea would be to make json_send and json_recv (binary JSON send/receive) use BSON rather than JSON-encoded text, as sending/receiving JSON-encoded text is exactly what text send/receive do. Joey Adams -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON Patch for PostgreSQL - BSON Support?
On 16/08/10 20:17, Joseph Adams wrote: Also, an idea would be to make json_send and json_recv (binary JSON send/receive) use BSON rather than JSON-encoded text, as sending/receiving JSON-encoded text is exactly what text send/receive do. The usual reason to use the binary format is performance, so it doesn't make much sense to use BSON for that if the internal storage format is something else. It would most likely be slower, not faster, than sending the string as is. Of course, if you switch to using BSON as the internal storage format, then it's natural to use that for the binary I/O format too. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON Patch for PostgreSQL - BSON Support?
On 8/15/10 8:47 PM, Andrew Dunstan wrote: On 08/15/2010 11:03 PM, Tom Lane wrote: Charles Pritchardch...@jumis.com writes: I'd originally sent this to Joseph Adams, as he has been working on adding a JSON datatype. I've suggested supporting BSON, as there are many client implementations available, I knew there would be a lot of critters crawling out as soon as we turned over this rock. Which other data-formats-of-the-week shall we immortalize as core PG types? If BSON is simply in effect an efficient encoding of JSON, then it's not clear to me that we would want another type at all. Rather, we might want to consider storing the data in this supposedly more efficient format, and maybe also some conversion routines. I agree that we don't want in core a huge array of general serialization formats. The one thing that JSON has going for it for general use, in my view, is that, unlike hstore, the structure is not flat. That makes it potentially useful for various purposes, especially complex structured function arguments, in places where using hstore can be rather limiting, and xml overly verbose. While I certainly haven't done homework on this -- I agree with Andrew. Storing internally as BSON (if it holds up to its premise) would mean more efficient traversal of internal objects in the future, if we were to have JSON-related functions/selectors. The core type would still be json, and would return as text, a json string, but internally it would be stored as BSON, and a function would be available, json_to_bson(typedjsoncol::json), returning a binary string. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON Patch for PostgreSQL - BSON Support?
Charles Pritchard ch...@jumis.com wrote: Storing internally as BSON (if it holds up to its premise) would mean more efficient traversal of internal objects in the future, if we were to have JSON-related functions/selectors. How about the fact that not all JSON objects can be represented in BSON (if the JSON object has a very long string), and not all BSON objects can be represented in JSON (if the BSON object has an array). Or do we invent our own flavors of one or both to cover the mismatch? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON Patch for PostgreSQL - BSON Support?
Kevin Grittner kevin.gritt...@wicourts.gov writes: Charles Pritchard ch...@jumis.com wrote: Storing internally as BSON (if it holds up to its premise) would mean more efficient traversal of internal objects in the future, if we were to have JSON-related functions/selectors. How about the fact that not all JSON objects can be represented in BSON (if the JSON object has a very long string), and not all BSON objects can be represented in JSON (if the BSON object has an array). Well, if it's not just a binary encoding of JSON, I think we can forget about it ... certainly it won't work in the form I was visualizing. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON Patch for PostgreSQL - BSON Support?
1;2403;0cOn Mon, Aug 16, 2010 at 05:02:47PM -0500, Kevin Grittner wrote: Charles Pritchard ch...@jumis.com wrote: Storing internally as BSON (if it holds up to its premise) would mean more efficient traversal of internal objects in the future, if we were to have JSON-related functions/selectors. How about the fact that not all JSON objects can be represented in BSON (if the JSON object has a very long string) Any such long string wont be representable in pg anyway. Or am I missing something here? Besides that I have to say that I find it pretty strange to design a supposedly generic file-format with a 32bit signed integer length... , and not all BSON objects can be represented in JSON (if the BSON object has an array). Or do we invent our own flavors of one or both to cover the mismatch? The BSON representation could be purely internal... Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON Patch for PostgreSQL - BSON Support?
On Mon, Aug 16, 2010 at 7:24 PM, Tom Lane t...@sss.pgh.pa.us wrote: Well, if it's not just a binary encoding of JSON, I think we can forget about it ... certainly it won't work in the form I was visualizing. regards, tom lane I just read the spec, and BSON has a lot of bells and whistles attached (such as labeling binary data with a subtype like UUID or MD5). With a couple exceptions (see below), any JSON can be converted to BSON (but the way BSON does arrays is silly: item indices are stored as strings), but not all BSONs can be converted to JSON without losing some type details. Others already mentioned that you can't convert 2 billion byte long JSON strings to BSON. Another issue is that BSON cannot encode all JSON numbers without precision loss. JSON can hold any number matching '-'? (0 | [1-9][0-9]*) ('.' [0-9]+)? ([Ee] [+-]? [0-9]+)? but BSON pidgenholes numeric values to either double, int32, int64, or a 12-byte MongoDB Object ID. Thus, for people who expect JSON to be able to hold arbitrary-precision numbers (which the JSON data type in my patch can), using BSON for transfer or storage will violate that expectation. Now that I know more about BSON, my opinion is that it shouldn't be used as the transfer or storage format of the JSON data type. Maybe if someone wants to do the work, BSON could be implemented as a contrib module, and functions could be provided in that module to convert to/from JSON with documented caveats. Joey Adams -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON Patch for PostgreSQL - BSON Support?
but BSON pidgenholes numeric values to either double, int32, int64, or a 12-byte MongoDB Object ID. Thus, for people who expect JSON to be able to hold arbitrary-precision numbers (which the JSON data type in my patch can), using BSON for transfer or storage will violate that expectation. Good lord. I'd suggest that maybe we wait for BSON v. 2.0 instead. Is BSON even any kind of a standard? -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON Patch for PostgreSQL - BSON Support?
On Mon, Aug 16, 2010 at 8:38 PM, Josh Berkus j...@agliodbs.com wrote: but BSON pidgenholes numeric values to either double, int32, int64, or a 12-byte MongoDB Object ID. Thus, for people who expect JSON to be able to hold arbitrary-precision numbers (which the JSON data type in my patch can), using BSON for transfer or storage will violate that expectation. Good lord. I'd suggest that maybe we wait for BSON v. 2.0 instead. Is BSON even any kind of a standard? You know that if it were a standard, it would be WORSE! :-) This falls into big time no way! If there was a standardized binary encoding for XML that was effectively a tree (e.g. - more or less like a set of Lisp objects with CAR/CDR linkages), that would be somewhat interesting, as it could be both comparatively compact, and perhaps offer rapid navigation through the tree. BSON doesn't sound like that! -- http://linuxfinances.info/info/linuxdistributions.html -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON Patch for PostgreSQL - BSON Support?
Joseph Adams joeyadams3.14...@gmail.com writes: Others already mentioned that you can't convert 2 billion byte long JSON strings to BSON. Another issue is that BSON cannot encode all JSON numbers without precision loss. As somebody already mentioned, the former isn't likely to be an issue for us anytime in the foreseeable future, because we can't push around datum values more than 1GB large anyhow. The latter seems like a pretty nasty problem though. I'm good with just dropping this idea for the moment. The Google hit statistics that were cited earlier show that there's not enough interest in BSON to justify a separate datatype, which is what it would apparently need to be. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON Patch for PostgreSQL - BSON Support?
I'd originally sent this to Joseph Adams, as he has been working on adding a JSON datatype. I've suggested supporting BSON, as there are many client implementations available, and the format is more efficient than xml and json trees for some use cases. http://bsonspec.org/ On 8/15/10 11:27 AM, Joseph Adams wrote: On Sat, Aug 14, 2010 at 6:35 PM, Charles Pritchardch...@jumis.com wrote: Hello, I saw that you've submitted a patch to add JSON support to PostgreSQL. Thought I'd mention that BSON is gradually picking up in popularity, and does have a good amount of cross-client support. That said, I wouldn't be surprised if a patch were looked at with suspicion, but it does make sense as a contrib module. BSON is a binary json format used by Mongo DB. I don't use it, at all, but it seems like a good fit, and the right time. Returning BSON would be more efficient than returning JSON, for some use cases. -Charles Would you mind posting this to the pgsql-hackers@postgresql.org mailing list? I think a BSON module would be a good idea, as it would provide a more efficient alternative to the more text-oriented JSON data type. Joey Adams -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON Patch for PostgreSQL - BSON Support?
Charles Pritchard ch...@jumis.com writes: I'd originally sent this to Joseph Adams, as he has been working on adding a JSON datatype. I've suggested supporting BSON, as there are many client implementations available, I knew there would be a lot of critters crawling out as soon as we turned over this rock. Which other data-formats-of-the-week shall we immortalize as core PG types? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON Patch for PostgreSQL - BSON Support?
On 08/15/2010 11:03 PM, Tom Lane wrote: Charles Pritchardch...@jumis.com writes: I'd originally sent this to Joseph Adams, as he has been working on adding a JSON datatype. I've suggested supporting BSON, as there are many client implementations available, I knew there would be a lot of critters crawling out as soon as we turned over this rock. Which other data-formats-of-the-week shall we immortalize as core PG types? If BSON is simply in effect an efficient encoding of JSON, then it's not clear to me that we would want another type at all. Rather, we might want to consider storing the data in this supposedly more efficient format, and maybe also some conversion routines. I agree that we don't want in core a huge array of general serialization formats. The one thing that JSON has going for it for general use, in my view, is that, unlike hstore, the structure is not flat. That makes it potentially useful for various purposes, especially complex structured function arguments, in places where using hstore can be rather limiting, and xml overly verbose. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON Patch for PostgreSQL - BSON Support?
On Sun, Aug 15, 2010 at 11:03 PM, Tom Lane t...@sss.pgh.pa.us wrote: Charles Pritchard ch...@jumis.com writes: I'd originally sent this to Joseph Adams, as he has been working on adding a JSON datatype. I've suggested supporting BSON, as there are many client implementations available, I knew there would be a lot of critters crawling out as soon as we turned over this rock. Which other data-formats-of-the-week shall we immortalize as core PG types? PER-encoded ASN.1, for when you really need something human-readable? :-) I think JSON is a reasonable choice for a core datatype; we don't really have anything else with the same functionality. But I'm not really in favor of adding any more, especially things like YAML and BSON that are essentially variants of JSON. Which is not to say I woudn't like to have those available, making the unproven assumption that someone wants to write the code, but I don't really see why they should be in core. My theory is that XML and JSON are the big two, and so far I haven't seen much evidence to the contrary. JSON also has the advantage, as compared with XML and in general, of being relatively simple and single-purpose. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers