Re: [PATCHES] boolean <=> text explicit casts
On May 30, 2007, at 3:40 PM, Neil Conway wrote: On Wed, 2007-30-05 at 21:23 +0200, Peter Eisentraut wrote: I'm not sure what your rationale was for creating lower-case words instead of upper case, except for it looks nicer. Is there a technical reason? There's no real technical reason: the standard says upper-case, but PG's general philosophy of case folding would suggest folding to lower- case. If we were compliant with the spec's case folding requirements then emitting uppercase would be the clear choice, but since we aren't, I don't have strong feelings either way. Sorry for the late reply... I'm worried that this would make us incompatible with cross-database code. Granted, should probably be using a boolean representation, but I'm not sure if that's universally true. And if we find out later that lower case is a problem, it won't be possible to change it without messing with the rest of our users. I think it'd be best to go with the spec. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] boolean <=> text explicit casts
On Wed, 2007-30-05 at 21:23 +0200, Peter Eisentraut wrote: > I'm not sure what your rationale was for creating lower-case words > instead of upper case, except for it looks nicer. Is there a technical > reason? There's no real technical reason: the standard says upper-case, but PG's general philosophy of case folding would suggest folding to lower-case. If we were compliant with the spec's case folding requirements then emitting uppercase would be the clear choice, but since we aren't, I don't have strong feelings either way. -Neil ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] boolean <=> text explicit casts
Neil Conway wrote: > Attached is a revised version of this patch that modifies boolin() to > ignore leading and trailing whitespace. This makes text => boolean > trivial, but boolean => text is still distinct from boolout(). I'm not sure what your rationale was for creating lower-case words instead of upper case, except for it looks nicer. Is there a technical reason? -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] boolean <=> text explicit casts
On Mon, 2007-28-05 at 15:38 -0400, Tom Lane wrote: > More generally, I'm really hoping to get rid of bespoke text<->whatever > cast functions in favor of using datatypes' I/O functions. To what > extent can we make the boolean I/O functions serve for this? It seems > relatively painless on the input side --- just allow whitespace --- but > I suppose we can't change boolout's historical result of "t"/"f" without > causing problems. Attached is a revised version of this patch that modifies boolin() to ignore leading and trailing whitespace. This makes text => boolean trivial, but boolean => text is still distinct from boolout(). Barring any objections, I'll apply this later today or tomorrow. -Neil Index: src/backend/utils/adt/bool.c === RCS file: /home/neilc/postgres/cvs_root/pgsql/src/backend/utils/adt/bool.c,v retrieving revision 1.38 diff -p -c -r1.38 bool.c *** src/backend/utils/adt/bool.c 5 Jan 2007 22:19:40 - 1.38 --- src/backend/utils/adt/bool.c 30 May 2007 18:55:26 - *** *** 15,20 --- 15,22 #include "postgres.h" + #include + #include "libpq/pqformat.h" #include "utils/builtins.h" *** *** 33,73 Datum boolin(PG_FUNCTION_ARGS) { ! char *b = PG_GETARG_CSTRING(0); ! switch (*b) { case 't': case 'T': ! if (pg_strncasecmp(b, "true", strlen(b)) == 0) PG_RETURN_BOOL(true); break; case 'f': case 'F': ! if (pg_strncasecmp(b, "false", strlen(b)) == 0) PG_RETURN_BOOL(false); break; case 'y': case 'Y': ! if (pg_strncasecmp(b, "yes", strlen(b)) == 0) PG_RETURN_BOOL(true); break; case '1': ! if (pg_strncasecmp(b, "1", strlen(b)) == 0) PG_RETURN_BOOL(true); break; case 'n': case 'N': ! if (pg_strncasecmp(b, "no", strlen(b)) == 0) PG_RETURN_BOOL(false); break; case '0': ! if (pg_strncasecmp(b, "0", strlen(b)) == 0) PG_RETURN_BOOL(false); break; --- 35,88 Datum boolin(PG_FUNCTION_ARGS) { ! const char *in_str = PG_GETARG_CSTRING(0); ! const char *str; ! size_t len; ! ! /* ! * Skip leading and trailing whitespace ! */ ! str = in_str; ! while (isspace((unsigned char) *str)) ! str++; ! ! len = strlen(str); ! while (len > 0 && isspace((unsigned char) str[len - 1])) ! len--; ! switch (*str) { case 't': case 'T': ! if (pg_strncasecmp(str, "true", len) == 0) PG_RETURN_BOOL(true); break; case 'f': case 'F': ! if (pg_strncasecmp(str, "false", len) == 0) PG_RETURN_BOOL(false); break; case 'y': case 'Y': ! if (pg_strncasecmp(str, "yes", len) == 0) PG_RETURN_BOOL(true); break; case '1': ! if (pg_strncasecmp(str, "1", len) == 0) PG_RETURN_BOOL(true); break; case 'n': case 'N': ! if (pg_strncasecmp(str, "no", len) == 0) PG_RETURN_BOOL(false); break; case '0': ! if (pg_strncasecmp(str, "0", len) == 0) PG_RETURN_BOOL(false); break; *** boolin(PG_FUNCTION_ARGS) *** 77,83 ereport(ERROR, (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION), ! errmsg("invalid input syntax for type boolean: \"%s\"", b))); /* not reached */ PG_RETURN_BOOL(false); --- 92,98 ereport(ERROR, (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION), ! errmsg("invalid input syntax for type boolean: \"%s\"", in_str))); /* not reached */ PG_RETURN_BOOL(false); *** boolsend(PG_FUNCTION_ARGS) *** 127,132 --- 142,178 PG_RETURN_BYTEA_P(pq_endtypsend(&buf)); } + /* + * textbool - cast function for text => bool + */ + Datum + textbool(PG_FUNCTION_ARGS) + { + Datum in_text = PG_GETARG_DATUM(0); + char *str; + + str = DatumGetCString(DirectFunctionCall1(textout, in_text)); + + PG_RETURN_DATUM(DirectFunctionCall1(boolin, CStringGetDatum(str))); + } + + /* + * booltext - cast function for bool => text + */ + Datum + booltext(PG_FUNCTION_ARGS) + { + bool arg1 = PG_GETARG_BOOL(0); + char *str; + + if (arg1) + str = "true"; + else + str = "false"; + + PG_RETURN_DATUM(DirectFunctionCall1(textin, CStringGetDatum(str))); + } + /* * PUBLIC ROUTINES * Index: src/include/catalog/pg_cast.h === RCS file: /home/neilc/postgres/cvs_root/pgsql/src/include/catalog/pg_cast.h,v retrieving revision 1.32 diff -p -c -r1.32 pg_cast.h *** src/include/catalog/pg_cast.h 2 Apr 2007 03:49:40 - 1.32 --- src/include/catalog/pg_cast.h 30 May 2007 17:59:20 - *** DATA(insert ( 1700 25 1688 i )); *** 302,307 --- 302,309 DATA(insert ( 25 1700 1686 e )); DATA(insert ( 142 25 2922 e )); DATA(insert ( 25 142 2896 e ));
Re: [PATCHES] boolean <=> text explicit casts
Neil Conway <[EMAIL PROTECTED]> writes: > On Mon, 2007-28-05 at 15:38 -0400, Tom Lane wrote: >> More generally, I'm really hoping to get rid of bespoke text<->whatever >> cast functions in favor of using datatypes' I/O functions. > I don't object, but I'm curious: is there a benefit to this other than > brevity of implementation? ISTM the spec has the idea that the input to > a type's constructor is often distinct from the type's text => type > casting behavior. Well, (a) it would fill in a whole lot of text-conversion cases that are currently missing, and (b) it would encourage datatype implementors to keep the I/O and text-conversion cases behaving alike unless there were a REALLY good reason not to. IMHO most of the cases that the SQL spec calls out as behaving differently are pure brain-damage. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] boolean <=> text explicit casts
On Mon, 2007-28-05 at 15:38 -0400, Tom Lane wrote: > More generally, I'm really hoping to get rid of bespoke text<->whatever > cast functions in favor of using datatypes' I/O functions. I don't object, but I'm curious: is there a benefit to this other than brevity of implementation? ISTM the spec has the idea that the input to a type's constructor is often distinct from the type's text => type casting behavior. -Neil ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] boolean <=> text explicit casts
Neil Conway <[EMAIL PROTECTED]> writes: > (2) The spec also requires that boolean::varchar(n) should raise an > error if "n" is not large enough to accomodate the textual > representation of the boolean value. Really? That's in direct contradiction to the "normal" spec-required behavior of casting to varchar(n). I'd suggest ignoring it on the grounds that the SQL committee have forgotten what they wrote themselves. > (3) The spec suggests that true/false should be upper-cased when > converted to text, so that's what I've implemented, but one could argue > that converting to lower-case would be more consistent with PG's general > approach to case folding. hm, +1 for lower case myself, but not dead set on it. More generally, I'm really hoping to get rid of bespoke text<->whatever cast functions in favor of using datatypes' I/O functions. To what extent can we make the boolean I/O functions serve for this? It seems relatively painless on the input side --- just allow whitespace --- but I suppose we can't change boolout's historical result of "t"/"f" without causing problems. Also, invoking btrim() seems an exceedingly expensive way of ignoring a bit of whitespace. I suppose inefficiency in a seldom-used cast function does not matter, but please don't do it that way in boolin. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PATCHES] boolean <=> text explicit casts
I noticed that SQL:2003 specifies explicit casts between "boolean" and the character string types. Attached is a patch that implements them, and adds some simple regression tests. A few points worth noting: (1) The SQL spec requires that text::boolean trim leading and trailing whitespace from the input (2) The spec also requires that boolean::varchar(n) should raise an error if "n" is not large enough to accomodate the textual representation of the boolean value. We currently truncate: => select true::boolean::varchar(3); varchar - TRU Not sure offhand if there's an easy way to satisfy the spec's requirement... (3) The spec suggests that true/false should be upper-cased when converted to text, so that's what I've implemented, but one could argue that converting to lower-case would be more consistent with PG's general approach to case folding. -Neil Index: src/backend/utils/adt/bool.c === RCS file: /home/neilc/postgres/cvs_root/pgsql/src/backend/utils/adt/bool.c,v retrieving revision 1.38 diff -p -c -r1.38 bool.c *** src/backend/utils/adt/bool.c 5 Jan 2007 22:19:40 - 1.38 --- src/backend/utils/adt/bool.c 28 May 2007 18:47:59 - *** boolsend(PG_FUNCTION_ARGS) *** 127,132 --- 127,177 PG_RETURN_BYTEA_P(pq_endtypsend(&buf)); } + /* + * textbool - cast function for text => bool + */ + Datum + textbool(PG_FUNCTION_ARGS) + { + text *arg1 = PG_GETARG_TEXT_P(0); + text *trim; + char *trim_str; + + trim = DatumGetTextP(DirectFunctionCall1(btrim1, + PointerGetDatum(arg1))); + trim_str = DatumGetCString(DirectFunctionCall1(textout, + PointerGetDatum(trim))); + + if (pg_strcasecmp(trim_str, "true") == 0) + PG_RETURN_BOOL(true); + else if (pg_strcasecmp(trim_str, "false") == 0) + PG_RETURN_BOOL(false); + else + ereport(ERROR, + (errcode(ERRCODE_INVALID_CHARACTER_VALUE_FOR_CAST), + errmsg("invalid input syntax for type boolean: \"%s\"", + trim_str))); + + PG_RETURN_BOOL(false); /* keep the compiler quiet */ + } + + /* + * booltext - cast function for bool => text + */ + Datum + booltext(PG_FUNCTION_ARGS) + { + bool arg1 = PG_GETARG_BOOL(0); + char *str; + + if (arg1) + str = "TRUE"; + else + str = "FALSE"; + + PG_RETURN_DATUM(DirectFunctionCall1(textin, CStringGetDatum(str))); + } + /* * PUBLIC ROUTINES * Index: src/include/catalog/pg_cast.h === RCS file: /home/neilc/postgres/cvs_root/pgsql/src/include/catalog/pg_cast.h,v retrieving revision 1.32 diff -p -c -r1.32 pg_cast.h *** src/include/catalog/pg_cast.h 2 Apr 2007 03:49:40 - 1.32 --- src/include/catalog/pg_cast.h 28 May 2007 18:41:38 - *** DATA(insert ( 1700 25 1688 i )); *** 302,307 --- 302,309 DATA(insert ( 25 1700 1686 e )); DATA(insert ( 142 25 2922 e )); DATA(insert ( 25 142 2896 e )); + DATA(insert ( 16 25 2971 e )); + DATA(insert ( 25 16 2970 e )); /* * Cross-category casts to and from VARCHAR *** DATA(insert ( 1700 1043 1688 a )); *** 342,347 --- 344,351 DATA(insert ( 1043 1700 1686 e )); DATA(insert ( 142 1043 2922 e )); DATA(insert ( 1043 142 2896 e )); + DATA(insert ( 16 1043 2971 e )); + DATA(insert ( 1043 16 2970 e )); /* * Cross-category casts to and from BPCHAR Index: src/include/catalog/pg_proc.h === RCS file: /home/neilc/postgres/cvs_root/pgsql/src/include/catalog/pg_proc.h,v retrieving revision 1.456 diff -p -c -r1.456 pg_proc.h *** src/include/catalog/pg_proc.h 21 May 2007 17:10:29 - 1.456 --- src/include/catalog/pg_proc.h 28 May 2007 18:40:53 - *** DESCR("List all files in a directory"); *** 3221,3226 --- 3221,3230 DATA(insert OID = 2626 ( pg_sleep PGNSP PGUID 12 1 0 f f t f v 1 2278 "701" _null_ _null_ _null_ pg_sleep - _null_ )); DESCR("Sleep for the specified time in seconds"); + DATA(insert OID = 2970 ( boolean PGNSP PGUID 12 1 0 f f t f i 1 16 "25" _null_ _null_ _null_ textbool - _null_ )); + DESCR("text to boolean"); + DATA(insert OID = 2971 ( textPGNSP PGUID 12 1 0 f f t f i 1 25 "16" _null_ _null_ _null_ booltext - _null_ )); + DESCR("boolean to text"); /* Aggregates (moved here from pg_aggregate for 7.3) */ Index: src/include/utils/builtins.h === RCS file: /home/neilc/postgres/cvs_root/pgsql/src/include/utils/builtins.h,v retrieving revision 1.293 diff -p -c -r1.293 builtins.h *** src/include/utils/builtins.h 17 May 2007 23:31:49 - 1.293 --- src/include/utils/builtins.h 28 May 2007 18:13:15 - *** extern Datum boolin(PG_FUNCTION_ARGS); *** 70,75 --- 70,77