Re: [PATCHES] boolean <=> text explicit casts

2007-06-19 Thread Jim Nasby

On May 30, 2007, at 3:40 PM, Neil Conway wrote:

On Wed, 2007-30-05 at 21:23 +0200, Peter Eisentraut wrote:

I'm not sure what your rationale was for creating lower-case words
instead of upper case, except for it looks nicer.  Is there a  
technical

reason?


There's no real technical reason: the standard says upper-case, but  
PG's
general philosophy of case folding would suggest folding to lower- 
case.

If we were compliant with the spec's case folding requirements then
emitting uppercase would be the clear choice, but since we aren't, I
don't have strong feelings either way.


Sorry for the late reply...

I'm worried that this would make us incompatible with cross-database  
code. Granted, should probably be using a boolean representation, but  
I'm not sure if that's universally true. And if we find out later  
that lower case is a problem, it won't be possible to change it  
without messing with the rest of our users. I think it'd be best to  
go with the spec.

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] boolean <=> text explicit casts

2007-05-30 Thread Neil Conway
On Wed, 2007-30-05 at 21:23 +0200, Peter Eisentraut wrote:
> I'm not sure what your rationale was for creating lower-case words 
> instead of upper case, except for it looks nicer.  Is there a technical 
> reason?

There's no real technical reason: the standard says upper-case, but PG's
general philosophy of case folding would suggest folding to lower-case.
If we were compliant with the spec's case folding requirements then
emitting uppercase would be the clear choice, but since we aren't, I
don't have strong feelings either way.

-Neil



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] boolean <=> text explicit casts

2007-05-30 Thread Peter Eisentraut
Neil Conway wrote:
> Attached is a revised version of this patch that modifies boolin() to
> ignore leading and trailing whitespace. This makes text => boolean
> trivial, but boolean => text is still distinct from boolout().

I'm not sure what your rationale was for creating lower-case words 
instead of upper case, except for it looks nicer.  Is there a technical 
reason?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] boolean <=> text explicit casts

2007-05-30 Thread Neil Conway
On Mon, 2007-28-05 at 15:38 -0400, Tom Lane wrote:
> More generally, I'm really hoping to get rid of bespoke text<->whatever
> cast functions in favor of using datatypes' I/O functions.  To what
> extent can we make the boolean I/O functions serve for this?  It seems
> relatively painless on the input side --- just allow whitespace --- but
> I suppose we can't change boolout's historical result of "t"/"f" without
> causing problems.

Attached is a revised version of this patch that modifies boolin() to
ignore leading and trailing whitespace. This makes text => boolean
trivial, but boolean => text is still distinct from boolout().

Barring any objections, I'll apply this later today or tomorrow.

-Neil

Index: src/backend/utils/adt/bool.c
===
RCS file: /home/neilc/postgres/cvs_root/pgsql/src/backend/utils/adt/bool.c,v
retrieving revision 1.38
diff -p -c -r1.38 bool.c
*** src/backend/utils/adt/bool.c	5 Jan 2007 22:19:40 -	1.38
--- src/backend/utils/adt/bool.c	30 May 2007 18:55:26 -
***
*** 15,20 
--- 15,22 
  
  #include "postgres.h"
  
+ #include 
+ 
  #include "libpq/pqformat.h"
  #include "utils/builtins.h"
  
***
*** 33,73 
  Datum
  boolin(PG_FUNCTION_ARGS)
  {
! 	char	   *b = PG_GETARG_CSTRING(0);
  
! 	switch (*b)
  	{
  		case 't':
  		case 'T':
! 			if (pg_strncasecmp(b, "true", strlen(b)) == 0)
  PG_RETURN_BOOL(true);
  			break;
  
  		case 'f':
  		case 'F':
! 			if (pg_strncasecmp(b, "false", strlen(b)) == 0)
  PG_RETURN_BOOL(false);
  			break;
  
  		case 'y':
  		case 'Y':
! 			if (pg_strncasecmp(b, "yes", strlen(b)) == 0)
  PG_RETURN_BOOL(true);
  			break;
  
  		case '1':
! 			if (pg_strncasecmp(b, "1", strlen(b)) == 0)
  PG_RETURN_BOOL(true);
  			break;
  
  		case 'n':
  		case 'N':
! 			if (pg_strncasecmp(b, "no", strlen(b)) == 0)
  PG_RETURN_BOOL(false);
  			break;
  
  		case '0':
! 			if (pg_strncasecmp(b, "0", strlen(b)) == 0)
  PG_RETURN_BOOL(false);
  			break;
  
--- 35,88 
  Datum
  boolin(PG_FUNCTION_ARGS)
  {
! 	const char	*in_str = PG_GETARG_CSTRING(0);
! 	const char 	*str;
! 	size_t 		 len;
! 
! 	/*
! 	 * Skip leading and trailing whitespace
! 	 */
! 	str = in_str;
! 	while (isspace((unsigned char) *str))
! 		str++;
! 
! 	len = strlen(str);
! 	while (len > 0 && isspace((unsigned char) str[len - 1]))
! 		len--;
  
! 	switch (*str)
  	{
  		case 't':
  		case 'T':
! 			if (pg_strncasecmp(str, "true", len) == 0)
  PG_RETURN_BOOL(true);
  			break;
  
  		case 'f':
  		case 'F':
! 			if (pg_strncasecmp(str, "false", len) == 0)
  PG_RETURN_BOOL(false);
  			break;
  
  		case 'y':
  		case 'Y':
! 			if (pg_strncasecmp(str, "yes", len) == 0)
  PG_RETURN_BOOL(true);
  			break;
  
  		case '1':
! 			if (pg_strncasecmp(str, "1", len) == 0)
  PG_RETURN_BOOL(true);
  			break;
  
  		case 'n':
  		case 'N':
! 			if (pg_strncasecmp(str, "no", len) == 0)
  PG_RETURN_BOOL(false);
  			break;
  
  		case '0':
! 			if (pg_strncasecmp(str, "0", len) == 0)
  PG_RETURN_BOOL(false);
  			break;
  
*** boolin(PG_FUNCTION_ARGS)
*** 77,83 
  
  	ereport(ERROR,
  			(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
! 			 errmsg("invalid input syntax for type boolean: \"%s\"", b)));
  
  	/* not reached */
  	PG_RETURN_BOOL(false);
--- 92,98 
  
  	ereport(ERROR,
  			(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
! 			 errmsg("invalid input syntax for type boolean: \"%s\"", in_str)));
  
  	/* not reached */
  	PG_RETURN_BOOL(false);
*** boolsend(PG_FUNCTION_ARGS)
*** 127,132 
--- 142,178 
  	PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
  }
  
+ /*
+  *	textbool			- cast function for text => bool
+  */
+ Datum
+ textbool(PG_FUNCTION_ARGS)
+ {
+ 	Datum 		 in_text = PG_GETARG_DATUM(0);
+ 	char 		*str;
+ 
+ 	str = DatumGetCString(DirectFunctionCall1(textout, in_text));
+ 
+ 	PG_RETURN_DATUM(DirectFunctionCall1(boolin, CStringGetDatum(str)));
+ }
+ 
+ /*
+  *	booltext			- cast function for bool => text
+  */
+ Datum
+ booltext(PG_FUNCTION_ARGS)
+ {
+ 	bool 		 arg1 = PG_GETARG_BOOL(0);
+ 	char 		*str;
+ 
+ 	if (arg1)
+ 		str = "true";
+ 	else
+ 		str = "false";
+ 
+ 	PG_RETURN_DATUM(DirectFunctionCall1(textin, CStringGetDatum(str)));
+ }
+ 
  
  /*
   *	 PUBLIC ROUTINES		 *
Index: src/include/catalog/pg_cast.h
===
RCS file: /home/neilc/postgres/cvs_root/pgsql/src/include/catalog/pg_cast.h,v
retrieving revision 1.32
diff -p -c -r1.32 pg_cast.h
*** src/include/catalog/pg_cast.h	2 Apr 2007 03:49:40 -	1.32
--- src/include/catalog/pg_cast.h	30 May 2007 17:59:20 -
*** DATA(insert ( 1700	 25 1688 i ));
*** 302,307 
--- 302,309 
  DATA(insert (	25 1700 1686 e ));
  DATA(insert (  142   25 2922 e ));
  DATA(insert (   25  142	2896 e ));

Re: [PATCHES] boolean <=> text explicit casts

2007-05-28 Thread Tom Lane
Neil Conway <[EMAIL PROTECTED]> writes:
> On Mon, 2007-28-05 at 15:38 -0400, Tom Lane wrote:
>> More generally, I'm really hoping to get rid of bespoke text<->whatever
>> cast functions in favor of using datatypes' I/O functions.

> I don't object, but I'm curious: is there a benefit to this other than
> brevity of implementation? ISTM the spec has the idea that the input to
> a type's constructor is often distinct from the type's text => type
> casting behavior.

Well, (a) it would fill in a whole lot of text-conversion cases that are
currently missing, and (b) it would encourage datatype implementors to
keep the I/O and text-conversion cases behaving alike unless there were
a REALLY good reason not to.  IMHO most of the cases that the SQL spec
calls out as behaving differently are pure brain-damage.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] boolean <=> text explicit casts

2007-05-28 Thread Neil Conway
On Mon, 2007-28-05 at 15:38 -0400, Tom Lane wrote:
> More generally, I'm really hoping to get rid of bespoke text<->whatever
> cast functions in favor of using datatypes' I/O functions.

I don't object, but I'm curious: is there a benefit to this other than
brevity of implementation? ISTM the spec has the idea that the input to
a type's constructor is often distinct from the type's text => type
casting behavior.

-Neil



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] boolean <=> text explicit casts

2007-05-28 Thread Tom Lane
Neil Conway <[EMAIL PROTECTED]> writes:
> (2) The spec also requires that boolean::varchar(n) should raise an
> error if "n" is not large enough to accomodate the textual
> representation of the boolean value.

Really?  That's in direct contradiction to the "normal" spec-required
behavior of casting to varchar(n).  I'd suggest ignoring it on the
grounds that the SQL committee have forgotten what they wrote
themselves.

> (3) The spec suggests that true/false should be upper-cased when
> converted to text, so that's what I've implemented, but one could argue
> that converting to lower-case would be more consistent with PG's general
> approach to case folding.

hm, +1 for lower case myself, but not dead set on it.

More generally, I'm really hoping to get rid of bespoke text<->whatever
cast functions in favor of using datatypes' I/O functions.  To what
extent can we make the boolean I/O functions serve for this?  It seems
relatively painless on the input side --- just allow whitespace --- but
I suppose we can't change boolout's historical result of "t"/"f" without
causing problems.

Also, invoking btrim() seems an exceedingly expensive way of ignoring a
bit of whitespace.  I suppose inefficiency in a seldom-used cast
function does not matter, but please don't do it that way in boolin.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[PATCHES] boolean <=> text explicit casts

2007-05-28 Thread Neil Conway
I noticed that SQL:2003 specifies explicit casts between "boolean" and
the character string types. Attached is a patch that implements them,
and adds some simple regression tests.

A few points worth noting:

(1) The SQL spec requires that text::boolean trim leading and trailing
whitespace from the input

(2) The spec also requires that boolean::varchar(n) should raise an
error if "n" is not large enough to accomodate the textual
representation of the boolean value. We currently truncate:

=> select true::boolean::varchar(3);
 varchar 
-
 TRU

Not sure offhand if there's an easy way to satisfy the spec's
requirement...

(3) The spec suggests that true/false should be upper-cased when
converted to text, so that's what I've implemented, but one could argue
that converting to lower-case would be more consistent with PG's general
approach to case folding.

-Neil

Index: src/backend/utils/adt/bool.c
===
RCS file: /home/neilc/postgres/cvs_root/pgsql/src/backend/utils/adt/bool.c,v
retrieving revision 1.38
diff -p -c -r1.38 bool.c
*** src/backend/utils/adt/bool.c	5 Jan 2007 22:19:40 -	1.38
--- src/backend/utils/adt/bool.c	28 May 2007 18:47:59 -
*** boolsend(PG_FUNCTION_ARGS)
*** 127,132 
--- 127,177 
  	PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
  }
  
+ /*
+  *	textbool			- cast function for text => bool
+  */
+ Datum
+ textbool(PG_FUNCTION_ARGS)
+ {
+ 	text 		*arg1 = PG_GETARG_TEXT_P(0);
+ 	text 		*trim;
+ 	char 		*trim_str;
+ 
+ 	trim = DatumGetTextP(DirectFunctionCall1(btrim1,
+ 			 PointerGetDatum(arg1)));
+ 	trim_str = DatumGetCString(DirectFunctionCall1(textout,
+    PointerGetDatum(trim)));
+ 
+ 	if (pg_strcasecmp(trim_str, "true") == 0)
+ 		PG_RETURN_BOOL(true);
+ 	else if (pg_strcasecmp(trim_str, "false") == 0)
+ 		PG_RETURN_BOOL(false);
+ 	else
+ 		ereport(ERROR,
+ (errcode(ERRCODE_INVALID_CHARACTER_VALUE_FOR_CAST),
+  errmsg("invalid input syntax for type boolean: \"%s\"",
+ 		trim_str)));
+ 
+ 	PG_RETURN_BOOL(false);		/* keep the compiler quiet */
+ }
+ 
+ /*
+  *	booltext			- cast function for bool => text
+  */
+ Datum
+ booltext(PG_FUNCTION_ARGS)
+ {
+ 	bool 		 arg1 = PG_GETARG_BOOL(0);
+ 	char 		*str;
+ 
+ 	if (arg1)
+ 		str = "TRUE";
+ 	else
+ 		str = "FALSE";
+ 
+ 	PG_RETURN_DATUM(DirectFunctionCall1(textin, CStringGetDatum(str)));
+ }
+ 
  
  /*
   *	 PUBLIC ROUTINES		 *
Index: src/include/catalog/pg_cast.h
===
RCS file: /home/neilc/postgres/cvs_root/pgsql/src/include/catalog/pg_cast.h,v
retrieving revision 1.32
diff -p -c -r1.32 pg_cast.h
*** src/include/catalog/pg_cast.h	2 Apr 2007 03:49:40 -	1.32
--- src/include/catalog/pg_cast.h	28 May 2007 18:41:38 -
*** DATA(insert ( 1700	 25 1688 i ));
*** 302,307 
--- 302,309 
  DATA(insert (	25 1700 1686 e ));
  DATA(insert (  142   25 2922 e ));
  DATA(insert (   25  142	2896 e ));
+ DATA(insert (   16   25 2971 e ));
+ DATA(insert (   25   16 2970 e ));
  
  /*
   * Cross-category casts to and from VARCHAR
*** DATA(insert ( 1700 1043 1688 a ));
*** 342,347 
--- 344,351 
  DATA(insert ( 1043 1700 1686 e ));
  DATA(insert (  142 1043 2922 e ));
  DATA(insert ( 1043  142 2896 e ));
+ DATA(insert (   16 1043 2971 e ));
+ DATA(insert ( 1043   16 2970 e ));
  
  /*
   * Cross-category casts to and from BPCHAR
Index: src/include/catalog/pg_proc.h
===
RCS file: /home/neilc/postgres/cvs_root/pgsql/src/include/catalog/pg_proc.h,v
retrieving revision 1.456
diff -p -c -r1.456 pg_proc.h
*** src/include/catalog/pg_proc.h	21 May 2007 17:10:29 -	1.456
--- src/include/catalog/pg_proc.h	28 May 2007 18:40:53 -
*** DESCR("List all files in a directory");
*** 3221,3226 
--- 3221,3230 
  DATA(insert OID = 2626 ( pg_sleep			PGNSP PGUID 12 1 0 f f t f v 1 2278 "701" _null_ _null_ _null_ pg_sleep - _null_ ));
  DESCR("Sleep for the specified time in seconds");
  
+ DATA(insert OID = 2970 (  boolean			PGNSP PGUID 12 1 0 f f t f i 1 16 "25" _null_ _null_ _null_	textbool - _null_ ));
+ DESCR("text to boolean");
+ DATA(insert OID = 2971 (  textPGNSP PGUID 12 1 0 f f t f i 1 25 "16" _null_ _null_ _null_	booltext - _null_ ));
+ DESCR("boolean to text");
  
  /* Aggregates (moved here from pg_aggregate for 7.3) */
  
Index: src/include/utils/builtins.h
===
RCS file: /home/neilc/postgres/cvs_root/pgsql/src/include/utils/builtins.h,v
retrieving revision 1.293
diff -p -c -r1.293 builtins.h
*** src/include/utils/builtins.h	17 May 2007 23:31:49 -	1.293
--- src/include/utils/builtins.h	28 May 2007 18:13:15 -
*** extern Datum boolin(PG_FUNCTION_ARGS);
*** 70,75 
--- 70,77