--On 18. Juni 2011 12:29:38 +0200 Bernd Helmle <maili...@oopsware.de> wrote:

Similar problems occur with a couple other modules I tried (hstore,
intarray).

Hmm, works for me. Seems you have messed up your installation in some way
(build against current -HEAD but running against a 9.1?).

I'm going to review in the next couple of days again.

A bit later than expected, but here is my summary on the patch:

-- Patch Status

The patch applies cleanly. Since it's primarily targeted as an addition to contrib/, it doesn't touch the backend source tree (besides documentation TOC entries), but adds a new subdirectory in contrib/json. The patch is in context diff as requested.

-- Functionality

The patch as is provides a basic implementation for a JSON datatype. It includes normalization and validation of JSON strings. It adds the datatype implementation as an extension.

The implementation focus on the datatype functionality only, there is no additional support for special operators. Thus, i think the comments in the control file (and docs) promises actually more than the contrib module will deliver:

+comment = 'data type for storing and manipulating JSON content'

I'm not sure, if "manipulating" is a correct description. Maybe i missed it, but i didn't see functions to manipulate JSON strings directly, for example, adding an object to a JSON string, ...

-- Coding

The JSON datatype is implemented as a variable length character type, thus allows very large JSON strings to be stored. It transparently decides when to escape unicode code points depending on the selected client- and server-encoding.

Validation is done through its own JSON parser. The parser itself is a recursive descent parser. It is noticable that the parser seems to define its own is_space() and is_digit() functions, e.g.:

+#define is_space(c) ((c) == '\t' || (c) == '\n' || (c) == '\r' || (c) == ' ')

Wouldn't it be more clean to use isspace() instead?

This code block in function json_escape_unicode() looks suspicious:

+   /* Convert to UTF-8, if necessary. */
+   {
+       const char *orig = json;
+       json = (const char *)
+           pg_do_encoding_conversion((unsigned char *) json, length,
+                                     GetDatabaseEncoding(), PG_UTF8);
+       if (json != orig)
+           length = strlen(json);
+   }

Seems it *always* wants to convert the string. Isn't there a "if" condition missing?

There's a commented code fragment missing, which should be removed (see last two lines of the snippet below):

+static unsigned int
+read_hex16(const char *in)
+{
+       unsigned int i;
+       unsigned int tmp;
+       unsigned int ret = 0;
+
+       for (i = 0; i < 4; i++)
+       {
+               char c = *in++;
+
+               Assert(is_hex_digit(c));
+
+               if (c >= '0' && c <= '9')
+                       tmp = c - '0';
+               else if (c >= 'A' && c <= 'F')
+                       tmp = c - 'A' + 10;
+               else /* if (c >= 'a' && c <= 'f') */
+                       tmp = c - 'a' + 10;

json.c introduces new appendStringInfo* functions, e.g.
appendStringInfoEscape() and appendStringInfoUtf8(). Maybe it's better
to name them to something different, since it may occur someday that the backend
will introduce the same notion with a different meaning. They are static,
but why not naming them into something like jsonAppendStringInfoUtf8() or similar?

-- Regression Tests

The patch includes a fairly complete regression test suite, which covers much
of the formatting, validating and input functionality of the JSON datatype. It also tests UNICODE sequences and input encoding with other server encoding than UTF-8.


--
Thanks

        Bernd

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to