Re: [PATCHES] [HACKERS] Unworkable column delimiter characters for COPY

2007-12-28 Thread Andrew Dunstan



Tom Lane wrote:

I see that we disallow the CSV quote character from appearing in the
null_print string, but not the escape character.  Is this
correct/sensible?  


Yes, because:
. nulls are never quoted
. fields containing the quote char must be quoted
. the escape char is only magical inside quoted fields



If it is correct, maybe the delimiter could also
match the escape character?

  


Yes, probably. Crazy, but I think it's workable. I'll take that test 
out, and just make sure that the delimiter is different from the quote.


cheers

andrew

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PATCHES] [HACKERS] Unworkable column delimiter characters for COPY

2007-12-27 Thread Andrew Dunstan



Tom Lane wrote:

Andrew Dunstan [EMAIL PROTECTED] writes:
  

Tom Lane wrote:


I think at minimum we need to forbid b, f, n, r, t, v, which are the
control character representations currently recognized by COPY.
But I'm tempted to make it reject all 26 lower-case ASCII letters,
as a form of future-proofing.  Thoughts?
  


  

Assuming this is only for non-CSV mode, it seems OK.



On looking closer, 'x', octal digits, and '.' would also be trouble.
So I made it reject a-z, 0-9, and dot.

It appears that the CSV mode is a few bricks shy of a load here as
well: it will let you do CSV DELIMITER '' resulting in entirely
broken output.  It seems we ought to forbid delimiter from matching CSV
quote or escape characters.  I'll let you clean up that case though...
  



This should do the trick - I'll apply it tomorrow.

cheers

andrew

Index: copy.c
===
RCS file: /cvsroot/pgsql/src/backend/commands/copy.c,v
retrieving revision 1.293
diff -c -r1.293 copy.c
*** copy.c  27 Dec 2007 18:28:58 -  1.293
--- copy.c  28 Dec 2007 04:07:06 -
***
*** 889,894 
--- 889,907 
   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg(COPY delimiter cannot be 
\%s\, cstate-delim)));


+   /* In CSV mode, disallow quote or escape chars as delimiter */
+   if (cstate-csv_mode)
+   {
+   if (cstate-delim[0] == cstate-quote[0])
+   ereport(ERROR,
+   
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg(COPY delimiter and 
quote must be different)));

+   else if (cstate-delim[0] == cstate-escape[0])
+   ereport(ERROR,
+   
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg(COPY delimiter and 
escape must be different)));

+   }
+
   /* Check header */
   if (!cstate-csv_mode  cstate-header_line)
   ereport(ERROR,



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PATCHES] [HACKERS] Unworkable column delimiter characters for COPY

2007-12-27 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 It seems we ought to forbid delimiter from matching CSV
 quote or escape characters.  I'll let you clean up that case though...

 This should do the trick - I'll apply it tomorrow.

A couple thoughts:

* This test needs to appear further down --- it is not sensible until
after you've checked strlen() == 1 for all the strings involved.

* I see that we disallow the CSV quote character from appearing in the
null_print string, but not the escape character.  Is this
correct/sensible?  If it is correct, maybe the delimiter could also
match the escape character?

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend