Greg Smith wrote:
The attached patch fixes all the issues I found in the original version of this code and completes the review I wanted to do. Someone else will need to take this from here. As I already mentioned, I can't comment on the quality of the piping implementation used to add this feature other than to say it worked for me.

The code below strikes me as being seriously broken.

The comment says that it doubles all single quotes, double quotes and backslashes. It doesn't (backslashes are not doubled) and in any case doing that would simply be wrong (with or without backslashes). The ONLY things that should be doubled are double quotes. Period.

That part is easy to fix.

But here's my question: why are we worrying at all about things like the encoding? Especially, why are we worrying about the *client* encoding for a server log file? We surely aren't going to switch encodings in the middle of a log file! ISTM that this routine should simply copy the input, byte for byte, apart from doubling the dquotes. Does that make sense? I assume that this routine has been copied from somewhere else without sufficient regard to the context (or the logic).

The code also illustrates something else that annoyed me elsewhere in the patch and that I have eliminated, namely use of a macro placed in the header file and then used in exactly one place in the code. The macro didn't make anything clearer - quite the reverse. ISTM that a macro used only in one file should be defined in that file, generally, and if it's used only once is probably not much use anyway.

cheers

andrew


+ + /*
+  * Escapes special characters in the string to conform
+  * with the csv type output.
+  * Replaces " with "", ' with '' and \ with \\.
+  */
+ static size_t
+ escape_string_literal(char *to, const char *from)
+ {
+       const char *source = from;
+       char       *target = to;
+       size_t          remaining = 0;
+       int     client_encoding = 0;
+ + if (from == NULL)
+               return remaining;
+               
+       remaining = strlen(from);
+       client_encoding = pg_get_client_encoding();
+ + while (remaining > 0 && *source != '\0')
+       {
+               char            c = *source;
+               int                     len;
+               int                     i;
+ + /* Fast path for plain ASCII */
+               if (!IS_HIGHBIT_SET(c))
+               {
+                       /* Apply quoting if needed */
+                       if (CSV_STR_DOUBLE(c, false))
+                               *target++ = c;
+                       /* Copy the character */
+                       *target++ = c;
+                       source++;
+                       remaining--;
+                       continue;
+               }
+ + /* Slow path for possible multibyte characters */
+               len = pg_encoding_mblen(client_encoding, source);
+ + /* Copy the character */
+               for (i = 0; i < len; i++)
+               {
+                       if (remaining == 0 || *source == '\0')
+                               break;
+                       *target++ = *source++;
+                       remaining--;
+               }
+ + /*
+                * If we hit premature end of string (ie, incomplete multibyte
+ * character), try to pad out to the correct length with spaces. + * We may not be able to pad completely, but we will always be + * able to insert at least one pad space (since we'd not have + * quoted a multibyte character). This should be enough to make
+                * a string that the server will error out on.
+                */
+               if (i < len)
+               {
+                       for (; i < len; i++)
+                       {
+                               if (((size_t) (target - to)) / 2 >= 
strlen(from))
+                                       break;
+                               *target++ = ' ';
+                       }
+                       break;
+               }
+       }
+ + /* Write the terminating NUL character. */
+       *target = '\0';
+ + return target - to;
+ }


---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
      subscribe-nomail command to [EMAIL PROTECTED] so that your
      message can get through to the mailing list cleanly

Reply via email to