Greetings,

  Please find attached a minor patch to remove the constraints that a
  user can't include the delimiter or quote characters in a 'NULL AS'
  string when importing CSV files.

  This allows a user to explicitly request that NULL conversion happen
  on fields which are quoted.  As the quote character is being allowed
  to be in the 'NULL AS' string now, there's no reason to exclude the
  delimiter character from being seen in that string as well, though
  unless quoted using the CSV quote character it won't ever be matched.

  An example of the usage:

  sfrost*=# \copy billing_data from ~/BillingSamplePricerFile.csv
            with csv header quote as '"' null as '""'

  This is no contrived example, it's an issue I ran into earlier today
  when I got a file which had (for reasons unknown to me and not easily
  changed upstream): 
  
  "1","V","WASHDCABC12","","120033"...

  Both of the ending columns shown are integer fields, the "" here being
  used to indicate a NULL value.

  Without the patch, an ERROR occurs:

  sfrost=> \copy billing_data from ~/BillingSamplePricerFile.csv 
            with csv header quote as '"'
  ERROR:  invalid input syntax for integer: ""

  And there's no way to get it to import with COPY CSV mode.  The
  patch adds this ability without affecting existing usage or changing
  the syntax.  Even with the patch an ERROR occurs with the default
  treatment of CSV files:

  sfrost=# \copy billing_data from ~/BillingSamplePricerFile.csv
           with csv header quote as '"'
  ERROR:  invalid input syntax for integer: ""

  Which would be expected.  If the file is modified to remove the ""s
  for NULL columns, it imports just fine with the syntax above.

  It'd be really nice to have this included.

        Thanks!

                Stephen
Index: src/backend/commands/copy.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/commands/copy.c,v
retrieving revision 1.285
diff -c -r1.285 copy.c
*** src/backend/commands/copy.c	20 Jun 2007 02:02:49 -0000	1.285
--- src/backend/commands/copy.c	27 Jul 2007 04:13:15 -0000
***************
*** 926,944 ****
  				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
  			  errmsg("COPY force not null only available using COPY FROM")));
  
- 	/* Don't allow the delimiter to appear in the null string. */
- 	if (strchr(cstate->null_print, cstate->delim[0]) != NULL)
- 		ereport(ERROR,
- 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- 		errmsg("COPY delimiter must not appear in the NULL specification")));
- 
- 	/* Don't allow the CSV quote char to appear in the null string. */
- 	if (cstate->csv_mode &&
- 		strchr(cstate->null_print, cstate->quote[0]) != NULL)
- 		ereport(ERROR,
- 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- 				 errmsg("CSV quote character must not appear in the NULL specification")));
- 
  	/* Disallow file COPY except to superusers. */
  	if (!pipe && !superuser())
  		ereport(ERROR,
--- 926,931 ----
***************
*** 2888,2894 ****
  	{
  		bool		found_delim = false;
  		bool		in_quote = false;
- 		bool		saw_quote = false;
  		char	   *start_ptr;
  		char	   *end_ptr;
  		int			input_len;
--- 2875,2880 ----
***************
*** 2921,2927 ****
  			/* start of quoted field (or part of field) */
  			if (c == quotec && !in_quote)
  			{
- 				saw_quote = true;
  				in_quote = true;
  				continue;
  			}
--- 2907,2912 ----
***************
*** 2971,2977 ****
  
  		/* Check whether raw input matched null marker */
  		input_len = end_ptr - start_ptr;
! 		if (!saw_quote && input_len == cstate->null_print_len &&
  			strncmp(start_ptr, cstate->null_print, input_len) == 0)
  			fieldvals[fieldno] = NULL;
  
--- 2956,2962 ----
  
  		/* Check whether raw input matched null marker */
  		input_len = end_ptr - start_ptr;
! 		if (input_len == cstate->null_print_len &&
  			strncmp(start_ptr, cstate->null_print, input_len) == 0)
  			fieldvals[fieldno] = NULL;
  
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq

Reply via email to