Fwd: Initial Review: JSON contrib module was: Re: [HACKERS] Another swing at JSON
Forwarding because the mailing list rejected the original message.

---------- Forwarded message ----------
From: Joey Adams <joeyadams3.14...@gmail.com>
Date: Tue, Jul 19, 2011 at 11:23 PM
Subject: Re: Initial Review: JSON contrib module was: Re: [HACKERS] Another swing at JSON
To: Alvaro Herrera <alvhe...@commandprompt.com>
Cc: Florian Pflug <f...@phlo.org>, Tom Lane <t...@sss.pgh.pa.us>, Robert Haas <robertmh...@gmail.com>, Bernd Helmle <maili...@oopsware.de>, Dimitri Fontaine <dimi...@2ndquadrant.fr>, David Fetter <da...@fetter.org>, Josh Berkus <j...@agliodbs.com>, Pg Hackers <pgsql-hackers@postgresql.org>

On Tue, Jul 19, 2011 at 10:01 PM, Alvaro Herrera <alvhe...@commandprompt.com> wrote:
> Would it work to have a separate entry point into mbutils.c that lets
> you cache the conversion proc caller-side?

That sounds like a really good idea.  There's still the overhead of
calling the proc, but I imagine it's a lot less than looking it up.

> I think the main problem is determining the byte length of each
> source character beforehand.

I'm not sure what you mean.  The idea is to convert the \u escape to
UTF-8 with unicode_to_utf8 (the length of the resulting UTF-8 sequence
is easy to compute), call the conversion proc to get the
null-terminated database-encoded character, then append the result to
whatever StringInfo the string is going into.

The only question mark is how big the destination buffer will need to
be.  The maximum number of bytes per char in any supported encoding is
4, but is it possible for one Unicode character to turn into multiple
characters in the database encoding?

While we're at it, should we provide the same capability to the SQL
parser?  Namely, the ability to use \u escapes above U+007F when the
server encoding is not UTF-8?

- Joey

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: Fwd: Initial Review: JSON contrib module was: Re: [HACKERS] Another swing at JSON
Joey Adams wrote:
> Forwarding because the mailing list rejected the original message.

Yes, I am seeing email failures to the 'core' email list.

---------------------------------------------------------------------------

> ---------- Forwarded message ----------
> From: Joey Adams <joeyadams3.14...@gmail.com>
> Subject: Re: Initial Review: JSON contrib module was: Re: [HACKERS] Another swing at JSON
>
> On Tue, Jul 19, 2011 at 10:01 PM, Alvaro Herrera <alvhe...@commandprompt.com> wrote:
> > Would it work to have a separate entry point into mbutils.c that lets
> > you cache the conversion proc caller-side?
>
> That sounds like a really good idea.  There's still the overhead of
> calling the proc, but I imagine it's a lot less than looking it up.
> [...]
--
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
Re: Fwd: Initial Review: JSON contrib module was: Re: [HACKERS] Another swing at JSON
Bruce Momjian wrote:
> Joey Adams wrote:
> > Forwarding because the mailing list rejected the original message.
>
> Yes, I am seeing email failures to the 'core' email list.

Marc says it is now fixed.

---------------------------------------------------------------------------

> ---------- Forwarded message ----------
> From: Joey Adams <joeyadams3.14...@gmail.com>
> Subject: Re: Initial Review: JSON contrib module was: Re: [HACKERS] Another swing at JSON
> [...]
--
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +