Re: [HACKERS] patch (for 9.1) string functions ( correct patch attached )

2011-02-17 Thread Bruce Momjian
Erik Rijkers wrote:
 On Thu, July 29, 2010 22:43, Erik Rijkers wrote:
  Hi Pavel,
 
  In xfunc.sgml, I came across a function example (for use of VARIADIC in 
  polymorphic functions),
  where the function name is concat():  (in the manual: 35.4.10. Polymorphic 
  SQL Functions).
  Although that is not strictly wrong, it seems better to change that name 
  when concat goes into
  core, as seems to be the plan.
 
  If you agree, it seems best to include this change in your patch and change 
  that example
  function's name when the stringfunc patch gets applied.
 
 
 My apologies, the previous email had the wrong doc-patch attached.
 
 Here is the correct one.

Applied for 9.1.  Thanks.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-08-24 Thread Itagaki Takahiro
I applied the attached patch to HEAD. concat(), concat_ws(), left(),
right(), and reverse() are in it, but format() and sprintf() are not.
It's my understanding that we don't have consensus about the best syntax
for the formatting function. We can forget about RAISE. C-like printf
syntax is the next candidate, but we should consider about other ones
restarting with a clean slate.

Anyway, the newly added functions are useful for developers especially
migrated from other database products. Thank you.


On Mon, Aug 23, 2010 at 11:41 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 2010/8/23 Tom Lane t...@sss.pgh.pa.us:
 You should leave RAISE alone and just think about printf.

 ok - then we don't need modify proposed patch. Format function is
 enough for PL/pgSQL and other PL languages has own mutation of this
 functions. There are not barrier for implementation as custom
 function, so we can hold this function most simple.

-- 
Itagaki Takahiro


stringfunc-20100824.diff
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-08-24 Thread Pavel Stehule
2010/8/24 Itagaki Takahiro itagaki.takah...@gmail.com:
 I applied the attached patch to HEAD. concat(), concat_ws(), left(),
 right(), and reverse() are in it, but format() and sprintf() are not.
 It's my understanding that we don't have consensus about the best syntax
 for the formatting function. We can forget about RAISE. C-like printf
 syntax is the next candidate, but we should consider about other ones
 restarting with a clean slate.


Thank you very much

C-like printf function is the most worse candidate - I don't like to
repeat discussion again - this function is designed for totally
different environment than SQL. I am sure, so we don't need a function
with complex formatting - maintaining to_char is good example.

Regards

Pavel Stehule



 Anyway, the newly added functions are useful for developers especially
 migrated from other database products. Thank you.


 On Mon, Aug 23, 2010 at 11:41 PM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 2010/8/23 Tom Lane t...@sss.pgh.pa.us:
 You should leave RAISE alone and just think about printf.

 ok - then we don't need modify proposed patch. Format function is
 enough for PL/pgSQL and other PL languages has own mutation of this
 functions. There are not barrier for implementation as custom
 function, so we can hold this function most simple.

 --
 Itagaki Takahiro


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-08-24 Thread Erik Rijkers
On Tue, August 24, 2010 08:32, Itagaki Takahiro wrote:
 I applied the attached patch to HEAD. concat(), concat_ws(), left(),
 right(), and reverse() are in it, but format() and sprintf() are not.

+1 to add also sprintf


Erik Rijkers



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-08-23 Thread Itagaki Takahiro
On Sat, Aug 7, 2010 at 8:39 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 I made a discussion page in wiki for the compatibility issue.
 http://wiki.postgresql.org/wiki/String_Functions_and_Operators_Compatibility
 nice, thank you

I filled cells for SQL Server and DB2.

  * concat() is not compatible between MySQL and Oracle/DB2. Which do we buy?
 I prefer a our implementation - it skip a NULL values and it has a
 variadic arguments.

OK. I'm going to put both concat() and concat_ws() into core.

  * How do other databases behave in left() and right() with negative lengths?
 little bit by python substring operations.

I'll respect your proposal. The behaviors for negative lengths will
be our specific feature, but I don't see any problems there.
Since other databases raises errors, user should have negative-protections
in their existing codes.

 I don't agree. This function isn't designed to replace string
 concation. It is designed to build a SQL string (for dynamic SQL) or
 format messages. It isn't designed to replace to_char function. It is
 designed to work mainly inside PLpgSQL functions and then is
 consistent with RAISE statement.

OK. I'll revert my changes to your original format().

But please wait a moment to include sprintf() and contrib/stringfunc.
I think the function is useful, but don't want to have two versions
of formatting functions. So, the extended features will be merged
into format() with additional syntax something like {10s}. Then,
we could simplify the code because some of complex format syntax
are not so useful in SQL, especially length+string formatter (*s).

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-08-23 Thread Pavel Stehule
2010/8/23 Itagaki Takahiro itagaki.takah...@gmail.com:
 On Sat, Aug 7, 2010 at 8:39 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 I made a discussion page in wiki for the compatibility issue.
 http://wiki.postgresql.org/wiki/String_Functions_and_Operators_Compatibility
 nice, thank you

 I filled cells for SQL Server and DB2.

  * concat() is not compatible between MySQL and Oracle/DB2. Which do we buy?
 I prefer a our implementation - it skip a NULL values and it has a
 variadic arguments.

 OK. I'm going to put both concat() and concat_ws() into core.

  * How do other databases behave in left() and right() with negative 
 lengths?
 little bit by python substring operations.

 I'll respect your proposal. The behaviors for negative lengths will
 be our specific feature, but I don't see any problems there.
 Since other databases raises errors, user should have negative-protections
 in their existing codes.

 I don't agree. This function isn't designed to replace string
 concation. It is designed to build a SQL string (for dynamic SQL) or
 format messages. It isn't designed to replace to_char function. It is
 designed to work mainly inside PLpgSQL functions and then is
 consistent with RAISE statement.

 OK. I'll revert my changes to your original format().

 But please wait a moment to include sprintf() and contrib/stringfunc.
 I think the function is useful, but don't want to have two versions
 of formatting functions. So, the extended features will be merged
 into format() with additional syntax something like {10s}. Then,
 we could simplify the code because some of complex format syntax
 are not so useful in SQL, especially length+string formatter (*s).

It's reason, why I moved sprintf to contrib. When you build a SQL
statement or error message, you don't need (usually) complex
formating. And when you need it, then you can use a contrib sprintf
function. I am not against to additional simply formating - but I
afraid so we will create a second printf function.  The formating
enhancing should be shared with RAISE NOTICE command. Probably we can
share code better now, when a PLpgSQL is in core.

Regards

Pavel



 --
 Itagaki Takahiro


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-08-23 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 ... The formating
 enhancing should be shared with RAISE NOTICE command.

I remain of the opinion that RAISE's approach to formatting is
completely broken and inextensible, and that any attempt to be somehow
compatible with it is going to lead to an unusably broken design.
You should leave RAISE alone and just think about printf.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-08-23 Thread Pavel Stehule
2010/8/23 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 ... The formating
 enhancing should be shared with RAISE NOTICE command.

 I remain of the opinion that RAISE's approach to formatting is
 completely broken and inextensible, and that any attempt to be somehow
 compatible with it is going to lead to an unusably broken design.
 You should leave RAISE alone and just think about printf.


ok - then we don't need modify proposed patch. Format function is
enough for PL/pgSQL and other PL languages has own mutation of this
functions. There are not barrier for implementation as custom
function, so we can hold this function most simple.

Regards

Pavel


                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-08-07 Thread Pavel Stehule
Hello

2010/8/7 Itagaki Takahiro itagaki.takah...@gmail.com:
 2010/7/26 Robert Haas robertmh...@gmail.com:
 Come to think of it, have we checked that the behavior of LEFT, RIGHT,
 REVERSE, etc. is the same on other DBs, especially as far as nulls,
 empty strings, too-large or negative subscripts, etc is concerned?  Is
 CONCAT('foo', NULL) = 'foo' really the behavior that everyone else
 implements here?

 I made a discussion page in wiki for the compatibility issue.
 http://wiki.postgresql.org/wiki/String_Functions_and_Operators_Compatibility


nice, thank you

 Please fill empty cells and fix wrong descriptions.
  * concat() is not compatible between MySQL and Oracle/DB2. Which do we buy?

I prefer a our implementation - it skip a NULL values and it has a
variadic arguments. MySQL's concat isn't too consistent - I don't know
why it has different NULL handlidg than concat_ws.

  * How do other databases behave in left() and right() with negative lengths?

I don't know about one with left() and right() functions. What I know,
only MS Access has these functions. The design of these functions is
inspirited by wide used a Oracle library PLvision - this library is
freeware now - but my code is original. See plvstr.left() and
plvstr.right() - and little bit by python substring operations. The
sense of negative arguments is elimination of necessary detoast
operations and utf8 related calculations. For right() it means skip
first n chars, for left() skip last n chars. These functions was
originally designed for contrib - and I still thinking so contrib is
better - My opinion isn't strong here - I prefer a fully functional
function in contrib before minimalistic version in core. Minimalistic
functions are trivial via substring.

  * Are there any databases that has similar features with format() or
 sprintf() ?

I know only about package from PLvision library -

select plvsubst.string('My name is %s %s', ARRAY['Pavel','Stěhule']);

but you can find a lot of custom implementations. I found a some
similar - not exactly this in T-SQL see FORMATMESSAGE() function. But
the using of this function is very limited and it is C API function
(available from T-SQL). It doesn't return a string, just write to log.



 And why does CONCAT() take a variadic ANY
 argument?  Shouldn't that be variadic TEXT?

 I think we have no other choice but to use VARIADIC any for variadic
 functions.
 We have all combinations of argument types for || operator, (text, text),
 (text, any), (any, text), but we cannot use such codes for variadic functions
 -- they have no limits of argument numbers. And in the case, the functions
 should be STABLE because they convert arguments to text in it with typout
 functions that might be STABLE.


 IMHO, I'd repeat, syntax for format() is a bad choice because it cannot
 concatenate multiple arguments without separator, though RAISE also uses it.
 %s format in sprintf() or {n} syntax in C#'s String.Format() seems to be
 a better design.

I don't agree. This function isn't designed to replace string
concation. It is designed to build a SQL string (for dynamic SQL) or
format messages. It isn't designed to replace to_char function. It is
designed to work mainly inside PLpgSQL functions and then is
consistent with RAISE statement.

Thank you

Regards

Pavel Stehule


 --
 Itagaki Takahiro


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-08-06 Thread Itagaki Takahiro
2010/7/26 Robert Haas robertmh...@gmail.com:
 Come to think of it, have we checked that the behavior of LEFT, RIGHT,
 REVERSE, etc. is the same on other DBs, especially as far as nulls,
 empty strings, too-large or negative subscripts, etc is concerned?  Is
 CONCAT('foo', NULL) = 'foo' really the behavior that everyone else
 implements here?

I made a discussion page in wiki for the compatibility issue.
http://wiki.postgresql.org/wiki/String_Functions_and_Operators_Compatibility

Please fill empty cells and fix wrong descriptions.
  * concat() is not compatible between MySQL and Oracle/DB2. Which do we buy?
  * How do other databases behave in left() and right() with negative lengths?
  * Are there any databases that has similar features with format() or
sprintf() ?


 And why does CONCAT() take a variadic ANY
 argument?  Shouldn't that be variadic TEXT?

I think we have no other choice but to use VARIADIC any for variadic
functions.
We have all combinations of argument types for || operator, (text, text),
(text, any), (any, text), but we cannot use such codes for variadic functions
-- they have no limits of argument numbers. And in the case, the functions
should be STABLE because they convert arguments to text in it with typout
functions that might be STABLE.


IMHO, I'd repeat, syntax for format() is a bad choice because it cannot
concatenate multiple arguments without separator, though RAISE also uses it.
%s format in sprintf() or {n} syntax in C#'s String.Format() seems to be
a better design.

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions ( correct patch attached )

2010-07-30 Thread Pavel Stehule
Hello

2010/7/29 Erik Rijkers e...@xs4all.nl:
 On Thu, July 29, 2010 22:43, Erik Rijkers wrote:
 Hi Pavel,

 In xfunc.sgml, I came across a function example (for use of VARIADIC in 
 polymorphic functions),
 where the function name is concat():  (in the manual: 35.4.10. Polymorphic 
 SQL Functions).
 Although that is not strictly wrong, it seems better to change that name 
 when concat goes into
 core, as seems to be the plan.

 If you agree, it seems best to include this change in your patch and change 
 that example
 function's name when the stringfunc patch gets applied.


 My apologies, the previous email had the wrong doc-patch attached.

 Here is the correct one.


it is good idea, thank you

Pavel


 Erik Rijkers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-30 Thread Robert Haas
On Mon, Jul 26, 2010 at 8:48 PM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Jul 26, 2010 at 3:49 PM, Merlin Moncure mmonc...@gmail.com wrote:
 concat() is not a variadic text function. it is variadic any that
 happens to do text coercion (not casting) inside the function. The
 the assumption that concat is casting internally is probably wrong.
 Suppose I had hacked the int-text cast to call a custom function -- I
 would very much expect concat() not to produce output from that
 function, just the vanilla output text (I could always force the cast
 if I wanted to).

 concat is just a function that does something highly similar to
 casting.  suppose I had a function count_memory(variadic any) that
 summed memory usage of input args -- forcing casts would make no sense
 in that context (I'm not suggesting that you think so -- just bringing
 up a case that illustrates how forcing cast into the function can
 change behavior in subtle ways).

 Right, but I already said I wasn't objecting to the use of variadic
 ANY in cases like that - only in cases where, as here, you were
 basically taking any old arguments and forcing them all to text.

I believe that another unpleasant side effect of this is that CONCAT()
will have to be declared stable rather than immutable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-29 Thread Erik Rijkers
Hi Pavel,

In xfunc.sgml, I came across a function example (for use of VARIADIC in 
polymorphic functions),
where the function name is concat():  (in the manual: 35.4.10. Polymorphic SQL 
Functions). 
Although that is not strictly wrong, it seems better to change that name when 
concat goes into
core, as seems to be the plan.

If you agree, it seems best to include this change in your patch and change 
that example
function's name when the stringfunc patch gets applied.


Erik Rijkers


xfunc.sgml.diff
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions ( correct patch attached )

2010-07-29 Thread Erik Rijkers
On Thu, July 29, 2010 22:43, Erik Rijkers wrote:
 Hi Pavel,

 In xfunc.sgml, I came across a function example (for use of VARIADIC in 
 polymorphic functions),
 where the function name is concat():  (in the manual: 35.4.10. Polymorphic 
 SQL Functions).
 Although that is not strictly wrong, it seems better to change that name when 
 concat goes into
 core, as seems to be the plan.

 If you agree, it seems best to include this change in your patch and change 
 that example
 function's name when the stringfunc patch gets applied.


My apologies, the previous email had the wrong doc-patch attached.

Here is the correct one.


Erik Rijkers


xfunc.sgml.diff
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-27 Thread Pavel Stehule
2010/7/26 Robert Haas robertmh...@gmail.com:
 On Mon, Jul 26, 2010 at 10:39 AM, Merlin Moncure mmonc...@gmail.com wrote:
 On Mon, Jul 26, 2010 at 9:26 AM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Jul 26, 2010 at 9:10 AM, Merlin Moncure mmonc...@gmail.com wrote:
 CONCAT('foo', NULL) = 'foo' really the behavior that everyone else
 implements here?  And why does CONCAT() take a variadic ANY
 argument?  Shouldn't that be variadic TEXT?

 What does that accomplish, besides forcing you to sprinkle every
 concat call with text casts (maybe that's not a bad thing?)?

 You could ask the same thing about the existing || operator.  And in
 fact, we used to have that behavior.  We changed it in 8.3.  Perhaps
 that was a good decision and perhaps it wasn't, but I don't think
 using CONCAT() to make an end-run around that decision is the way to
 go.

 It was absolutely a good decision because it prevented type inference
 in ways that were ambiguous or surprising (for a canonical case see:
 http://www.mail-archive.com/pgsql-gene...@postgresql.org/msg93224.html).

 || operator is still pretty tolerant in the 8.3+ world.
 select interval_col || bool_col; -- error
 select interval_col::text || bool_col; -- text concatenation
 select text_col || interval_col || bool_col; -- text concatenation

 variadic text would require text casts on EVERY non 'unknown' argument
 which drops it below the threshold of usefulness IMO -- it would be
 far stricter than vanilla || concatenation.  Ok, pure bikeshed here
 (shutting my trap now!), but concat() is one of those wonder functions
 that you want to make as usable and terse as possible.  I don't see
 the value in making it overly strict.

 I'm just very skeptical that we should give our functions argument
 types that are essentially fantasy.  CONCAT() doesn't concatenate
 integers or intervals or boxes: it concatenates strings, and only
 strings.  Surely the right fix, if our casting rules are too
 restrictive, is to fix the casting rules; not to start adding a bunch
 of kludgery in every function we define.


Rules are correct probably. The problem is in searching function
algorithm - it is too simple (and fast, just only one cycle). And some
exceptions - like COALESCE and similar are solved specifically on
parser level. We cannot enforce some casting on user level. PostgreSQL
is more strict with timestamps, dates than other databases. Sometimes
you have to do explicit casts, but it clean from context desired
datatype -

SELECT date_trunc('day', current_date) - result is timestamp, but it
is clean, so result have to be date ... When I proposed a parser hook
I though about these functions. With this hook, you can enforce any
necessary casting like some buildin functions does.

so any



















































































































































































































 The problem with your canonical example (and the other examples of
 this type I've seen) is that they are underspecified.  Given two
 identically-named operators, one of which accepts types T1 and T2, and
 the other of which accepts types T3 and T4, what is one to do with T1
 OP T4?  You can cast T1 to T3 and call the first operator or you can
 cast T4 to T2 and call the second one, and it's really not clear which
 is right, so you had better thrown an error.  The same applies to
 functions: given foo(text) and foo(date) and the call
 foo('2010-04-15'), you had better complain, or you may end up with a
 very sad user.  But sometimes our current casting rules require casts
 in situations where they don't seem necessary, such as
 LPAD(integer_column, '0', 5).  That's not ambiguous because there's
 only one definition of LPAD, and the chances that the user will be
 unpleasantly surprised if you call it seem quite low.

 In other words, this problem is not unique to CONCAT().

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-27 Thread Pavel Stehule

 so any datatype is last possibility - is workaroud for custom functions.

Probably the most correct implementation of CONCAT have to contains a
parser changes - and then you can use a some internal concat function
with text only parameters. VARIADIC with any is just workaround that
is enouhg

Regards

Pavel






















































































































































































































 The problem with your canonical example (and the other examples of
 this type I've seen) is that they are underspecified.  Given two
 identically-named operators, one of which accepts types T1 and T2, and
 the other of which accepts types T3 and T4, what is one to do with T1
 OP T4?  You can cast T1 to T3 and call the first operator or you can
 cast T4 to T2 and call the second one, and it's really not clear which
 is right, so you had better thrown an error.  The same applies to
 functions: given foo(text) and foo(date) and the call
 foo('2010-04-15'), you had better complain, or you may end up with a
 very sad user.  But sometimes our current casting rules require casts
 in situations where they don't seem necessary, such as
 LPAD(integer_column, '0', 5).  That's not ambiguous because there's
 only one definition of LPAD, and the chances that the user will be
 unpleasantly surprised if you call it seem quite low.

 In other words, this problem is not unique to CONCAT().

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise Postgres Company



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-27 Thread Robert Haas
On Mon, Jul 26, 2010 at 11:39 AM, Merlin Moncure mmonc...@gmail.com wrote:
 I'm just very skeptical that we should give our functions argument
 types that are essentially fantasy.  CONCAT() doesn't concatenate
 integers or intervals or boxes: it concatenates strings, and only
 strings.  Surely the right fix, if our casting rules are too
 restrictive, is to fix the casting rules; not to start adding a bunch
 of kludgery in every function we define.

 The problem with your canonical example (and the other examples of
 this type I've seen) is that they are underspecified.  Given two
 identically-named operators, one of which accepts types T1 and T2, and
 the other of which accepts types T3 and T4, what is one to do with T1
 OP T4?  You can cast T1 to T3 and call the first operator or you can
 cast T4 to T2 and call the second one, and it's really not clear which
 is right, so you had better thrown an error.  The same applies to
 functions: given foo(text) and foo(date) and the call
 foo('2010-04-15'), you had better complain, or you may end up with a
 very sad user.  But sometimes our current casting rules require casts
 in situations where they don't seem necessary, such as
 LPAD(integer_column, '0', 5).  That's not ambiguous because there's
 only one definition of LPAD, and the chances that the user will be
 unpleasantly surprised if you call it seem quite low.

 In other words, this problem is not unique to CONCAT().

 shoot, can't resist :-).

 Are the casting rules broken? If you want to do lpad w/o casts for
 integers, you can define it explicitly -- feature, not bug.  You can
 basically do this for any function with fixed arguments -- either in
 userland or core.  lpad(int) actually introduces a problem case with
 the minus sign possibly requiring application specific intervention,
 so things are probably correct exactly as they are.

Huh?  If you're arguing that LPAD should require a cast on an integer
argument because the defined behavior of the function might not be
what someone wants, then apparently explicit casts should be required
in all cases.  If you're arguing that I can eliminate the need for an
explicit cast by overloading LPAD(), I agree with you, but that's not
a feature.

 ISTM you are unhappy with the any variadic mechanism in general, not
 the casting rules.

No, I'm unhappy with the use of any to make an end-run around the
casting rules.  If you're writing a function that operates on an
argument of any type, like pg_sizeof() - or if you're writing a
function that does something like append two arrays of unspecified but
identical type or, perhaps, search an array of unspecified type for an
element of that same type - or if you're writing a function where the
types of the arguments can't be known in advance, like printf(), well,
then any is what you need.  But the only argument anyone has put
forward for making CONCAT() accept ANY instead of TEXT is that it
might require casting otherwise.  My response to that is well then
you have to cast it, or fix the casting rules.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-27 Thread Pavel Stehule
2010/7/26 Robert Haas robertmh...@gmail.com:
 On Mon, Jul 26, 2010 at 11:39 AM, Merlin Moncure mmonc...@gmail.com wrote:
 I'm just very skeptical that we should give our functions argument
 types that are essentially fantasy.  CONCAT() doesn't concatenate
 integers or intervals or boxes: it concatenates strings, and only
 strings.  Surely the right fix, if our casting rules are too
 restrictive, is to fix the casting rules; not to start adding a bunch
 of kludgery in every function we define.

 The problem with your canonical example (and the other examples of
 this type I've seen) is that they are underspecified.  Given two
 identically-named operators, one of which accepts types T1 and T2, and
 the other of which accepts types T3 and T4, what is one to do with T1
 OP T4?  You can cast T1 to T3 and call the first operator or you can
 cast T4 to T2 and call the second one, and it's really not clear which
 is right, so you had better thrown an error.  The same applies to
 functions: given foo(text) and foo(date) and the call
 foo('2010-04-15'), you had better complain, or you may end up with a
 very sad user.  But sometimes our current casting rules require casts
 in situations where they don't seem necessary, such as
 LPAD(integer_column, '0', 5).  That's not ambiguous because there's
 only one definition of LPAD, and the chances that the user will be
 unpleasantly surprised if you call it seem quite low.

 In other words, this problem is not unique to CONCAT().

 shoot, can't resist :-).

 Are the casting rules broken? If you want to do lpad w/o casts for
 integers, you can define it explicitly -- feature, not bug.  You can
 basically do this for any function with fixed arguments -- either in
 userland or core.  lpad(int) actually introduces a problem case with
 the minus sign possibly requiring application specific intervention,
 so things are probably correct exactly as they are.

 Huh?  If you're arguing that LPAD should require a cast on an integer
 argument because the defined behavior of the function might not be
 what someone wants, then apparently explicit casts should be required
 in all cases.  If you're arguing that I can eliminate the need for an
 explicit cast by overloading LPAD(), I agree with you, but that's not
 a feature.

 ISTM you are unhappy with the any variadic mechanism in general, not
 the casting rules.

 No, I'm unhappy with the use of any to make an end-run around the
 casting rules.  If you're writing a function that operates on an
 argument of any type, like pg_sizeof() - or if you're writing a
 function that does something like append two arrays of unspecified but
 identical type or, perhaps, search an array of unspecified type for an
 element of that same type - or if you're writing a function where the
 types of the arguments can't be known in advance, like printf(), well,
 then any is what you need.  But the only argument anyone has put
 forward for making CONCAT() accept ANY instead of TEXT is that it
 might require casting otherwise.  My response to that is well then
 you have to cast it, or fix the casting rules.

I understand, but with only text accepting, then CONCAT will has much
less benefit - you can't do a numeric list, for example

see concat(c1::text, ',', c2::text, ',' ...)

with this is much simpler use a pipes '' || c1 || ',' || c2 ... and
this operator does necessary cast self.

This function is probably one use case of exception from our rules.

Regards

Pavel Stehule

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-27 Thread Merlin Moncure
On Mon, Jul 26, 2010 at 3:07 PM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Jul 26, 2010 at 2:09 PM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 I understand, but with only text accepting, then CONCAT will has much
 less benefit - you can't do a numeric list, for example

 see concat(c1::text, ',', c2::text, ',' ...)

 with this is much simpler use a pipes '' || c1 || ',' || c2 ... and
 this operator does necessary cast self.

 This function is probably one use case of exception from our rules.

 At least two, right?

correct: there would be at least two.

 Because for that use case you'd probably want
 concat_ws().  In fact, it's hard for me to think of a variadic text
 function where you wouldn't want the no casts behavior you're
 getting via ANY.

concat() is not a variadic text function. it is variadic any that
happens to do text coercion (not casting) inside the function.  The
the assumption that concat is casting internally is probably wrong.
Suppose I had hacked the int-text cast to call a custom function -- I
would very much expect concat() not to produce output from that
function, just the vanilla output text (I could always force the cast
if I wanted to).

concat is just a function that does something highly similar to
casting.  suppose I had a function count_memory(variadic any) that
summed memory usage of input args -- forcing casts would make no sense
in that context (I'm not suggesting that you think so -- just bringing
up a case that illustrates how forcing cast into the function can
change behavior in subtle ways).

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-27 Thread Robert Haas
On Mon, Jul 26, 2010 at 2:09 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 I understand, but with only text accepting, then CONCAT will has much
 less benefit - you can't do a numeric list, for example

 see concat(c1::text, ',', c2::text, ',' ...)

 with this is much simpler use a pipes '' || c1 || ',' || c2 ... and
 this operator does necessary cast self.

 This function is probably one use case of exception from our rules.

At least two, right?  Because for that use case you'd probably want
concat_ws().  In fact, it's hard for me to think of a variadic text
function where you wouldn't want the no casts behavior you're
getting via ANY.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-27 Thread Robert Haas
On Mon, Jul 26, 2010 at 3:49 PM, Merlin Moncure mmonc...@gmail.com wrote:
 concat() is not a variadic text function. it is variadic any that
 happens to do text coercion (not casting) inside the function. The
 the assumption that concat is casting internally is probably wrong.
 Suppose I had hacked the int-text cast to call a custom function -- I
 would very much expect concat() not to produce output from that
 function, just the vanilla output text (I could always force the cast
 if I wanted to).

 concat is just a function that does something highly similar to
 casting.  suppose I had a function count_memory(variadic any) that
 summed memory usage of input args -- forcing casts would make no sense
 in that context (I'm not suggesting that you think so -- just bringing
 up a case that illustrates how forcing cast into the function can
 change behavior in subtle ways).

Right, but I already said I wasn't objecting to the use of variadic
ANY in cases like that - only in cases where, as here, you were
basically taking any old arguments and forcing them all to text.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-27 Thread Pavel Stehule
2010/7/26 Robert Haas robertmh...@gmail.com:
 On Mon, Jul 26, 2010 at 2:09 PM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 I understand, but with only text accepting, then CONCAT will has much
 less benefit - you can't do a numeric list, for example

 see concat(c1::text, ',', c2::text, ',' ...)

 with this is much simpler use a pipes '' || c1 || ',' || c2 ... and
 this operator does necessary cast self.

 This function is probably one use case of exception from our rules.

 At least two, right?  Because for that use case you'd probably want
 concat_ws().

sorry, yes

Pavel

 In fact, it's hard for me to think of a variadic text
 function where you wouldn't want the no casts behavior you're
 getting via ANY.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-26 Thread Robert Haas
On Sun, Jul 25, 2010 at 11:29 PM, Itagaki Takahiro
itagaki.takah...@gmail.com wrote:
 I think RAISE is badly designed. Using % as a placeholder has a limitation
 to format strings. For example, format() cannot work as concat():
  SELECT format('%%', 123, 456) = ERROR

It's hard to argue with this, as far as it goes.

 So, my proposal is renaming stringfunc//sprintf() to format(),
 and moving it into the core. I think sprintf() is superior to format()
 in every aspect; '%s%s' works as concat(), and '%s%%' can append
 % without blanks.

So forget about format() and put sprintf() in contrib/stringfunc.
That's not an argument for putting anything in core.  Perhaps such an
argument can be made, but this isn't it.

 Then, concat_ws() will be moved into core because contrib/stringfunc
 only has the function now. In addition, I'd like to include the function for
 the compatibility to MySQL. Also, concat() and concat_ws() can share
 the implementation.

Regardless of where this function ends up, the concat_ws documentation
should contain some mention of the fact that ws is intended to mean
with separator, and that the naming is chosen for compatibility with
MySQL.  As for where to put it, I see no terribly compelling reason
why it needs to be in core.  You can write array_to_string(array[txt1,
txt2, txt3], sep) and get the same effect as concat_ws(sep, txt1,
txt2, txt3).  I don't really want to start maintaining duplicate
functionality for things like this, especially since MySQL users will
no doubt expect that our implementation will be bug-compatible.  If
the output isn't identical to what MySQL does for every set of
arguments, it'll be reported as a bug.

Come to think of it, have we checked that the behavior of LEFT, RIGHT,
REVERSE, etc. is the same on other DBs, especially as far as nulls,
empty strings, too-large or negative subscripts, etc is concerned?  Is
CONCAT('foo', NULL) = 'foo' really the behavior that everyone else
implements here?  And why does CONCAT() take a variadic ANY
argument?  Shouldn't that be variadic TEXT?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-26 Thread Merlin Moncure
On Mon, Jul 26, 2010 at 8:02 AM, Robert Haas robertmh...@gmail.com wrote:
 Regardless of where this function ends up, the concat_ws documentation
 should contain some mention of the fact that ws is intended to mean
 with separator,

big +1 on that -- I've been loosely following the thread and I had
assumed 'ws' meant 'wide string' all this time :-).

 Come to think of it, have we checked that the behavior of LEFT, RIGHT,
 REVERSE, etc. is the same on other DBs, especially as far as nulls,
 empty strings, too-large or negative subscripts, etc is concerned?

Probably 'standard' behavior wrt null would be to be strict; return
null if any argument is null.  The proposed behavior seems ok though.

 CONCAT('foo', NULL) = 'foo' really the behavior that everyone else
 implements here?  And why does CONCAT() take a variadic ANY
 argument?  Shouldn't that be variadic TEXT?

What does that accomplish, besides forcing you to sprinkle every
concat call with text casts (maybe that's not a bad thing?)?

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-26 Thread Pavel Stehule

 CONCAT('foo', NULL) = 'foo' really the behavior that everyone else
 implements here?  And why does CONCAT() take a variadic ANY
 argument?  Shouldn't that be variadic TEXT?


CONCAT with variadic text parameter will be limited with existing
default casts to text - for example, you can't to cast date to text,
int to text.

postgres=# create or replace function concat(variadic text[]) returns
text as $$select string_agg(x,'') from unnest($1) x$$ language sql;
CREATE FUNCTION

postgres=# select concat('a','b');
 concat

 ab
(1 row)

Time: 20,812 ms
postgres=# select concat('a',10);
ERROR:  function concat(unknown, integer) does not exist
LINE 1: select concat('a',10);
   ^
HINT:  No function matches the given name and argument types. You
might need to add explicit type casts.

so with variadic any[] concat doesn't need explicit cats.

Regards

Pavel Stehule

p.s. inside function is every value transformed to text.

 merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-26 Thread Robert Haas
On Mon, Jul 26, 2010 at 9:10 AM, Merlin Moncure mmonc...@gmail.com wrote:
 CONCAT('foo', NULL) = 'foo' really the behavior that everyone else
 implements here?  And why does CONCAT() take a variadic ANY
 argument?  Shouldn't that be variadic TEXT?

 What does that accomplish, besides forcing you to sprinkle every
 concat call with text casts (maybe that's not a bad thing?)?

You could ask the same thing about the existing || operator.  And in
fact, we used to have that behavior.  We changed it in 8.3.  Perhaps
that was a good decision and perhaps it wasn't, but I don't think
using CONCAT() to make an end-run around that decision is the way to
go.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-26 Thread Merlin Moncure
On Mon, Jul 26, 2010 at 9:26 AM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Jul 26, 2010 at 9:10 AM, Merlin Moncure mmonc...@gmail.com wrote:
 CONCAT('foo', NULL) = 'foo' really the behavior that everyone else
 implements here?  And why does CONCAT() take a variadic ANY
 argument?  Shouldn't that be variadic TEXT?

 What does that accomplish, besides forcing you to sprinkle every
 concat call with text casts (maybe that's not a bad thing?)?

 You could ask the same thing about the existing || operator.  And in
 fact, we used to have that behavior.  We changed it in 8.3.  Perhaps
 that was a good decision and perhaps it wasn't, but I don't think
 using CONCAT() to make an end-run around that decision is the way to
 go.

It was absolutely a good decision because it prevented type inference
in ways that were ambiguous or surprising (for a canonical case see:
http://www.mail-archive.com/pgsql-gene...@postgresql.org/msg93224.html).

|| operator is still pretty tolerant in the 8.3+ world.
select interval_col || bool_col; -- error
select interval_col::text || bool_col; -- text concatenation
select text_col || interval_col || bool_col; -- text concatenation

variadic text would require text casts on EVERY non 'unknown' argument
which drops it below the threshold of usefulness IMO -- it would be
far stricter than vanilla || concatenation.  Ok, pure bikeshed here
(shutting my trap now!), but concat() is one of those wonder functions
that you want to make as usable and terse as possible.  I don't see
the value in making it overly strict.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-26 Thread Robert Haas
On Mon, Jul 26, 2010 at 10:39 AM, Merlin Moncure mmonc...@gmail.com wrote:
 On Mon, Jul 26, 2010 at 9:26 AM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Jul 26, 2010 at 9:10 AM, Merlin Moncure mmonc...@gmail.com wrote:
 CONCAT('foo', NULL) = 'foo' really the behavior that everyone else
 implements here?  And why does CONCAT() take a variadic ANY
 argument?  Shouldn't that be variadic TEXT?

 What does that accomplish, besides forcing you to sprinkle every
 concat call with text casts (maybe that's not a bad thing?)?

 You could ask the same thing about the existing || operator.  And in
 fact, we used to have that behavior.  We changed it in 8.3.  Perhaps
 that was a good decision and perhaps it wasn't, but I don't think
 using CONCAT() to make an end-run around that decision is the way to
 go.

 It was absolutely a good decision because it prevented type inference
 in ways that were ambiguous or surprising (for a canonical case see:
 http://www.mail-archive.com/pgsql-gene...@postgresql.org/msg93224.html).

 || operator is still pretty tolerant in the 8.3+ world.
 select interval_col || bool_col; -- error
 select interval_col::text || bool_col; -- text concatenation
 select text_col || interval_col || bool_col; -- text concatenation

 variadic text would require text casts on EVERY non 'unknown' argument
 which drops it below the threshold of usefulness IMO -- it would be
 far stricter than vanilla || concatenation.  Ok, pure bikeshed here
 (shutting my trap now!), but concat() is one of those wonder functions
 that you want to make as usable and terse as possible.  I don't see
 the value in making it overly strict.

I'm just very skeptical that we should give our functions argument
types that are essentially fantasy.  CONCAT() doesn't concatenate
integers or intervals or boxes: it concatenates strings, and only
strings.  Surely the right fix, if our casting rules are too
restrictive, is to fix the casting rules; not to start adding a bunch
of kludgery in every function we define.

The problem with your canonical example (and the other examples of
this type I've seen) is that they are underspecified.  Given two
identically-named operators, one of which accepts types T1 and T2, and
the other of which accepts types T3 and T4, what is one to do with T1
OP T4?  You can cast T1 to T3 and call the first operator or you can
cast T4 to T2 and call the second one, and it's really not clear which
is right, so you had better thrown an error.  The same applies to
functions: given foo(text) and foo(date) and the call
foo('2010-04-15'), you had better complain, or you may end up with a
very sad user.  But sometimes our current casting rules require casts
in situations where they don't seem necessary, such as
LPAD(integer_column, '0', 5).  That's not ambiguous because there's
only one definition of LPAD, and the chances that the user will be
unpleasantly surprised if you call it seem quite low.

In other words, this problem is not unique to CONCAT().

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-26 Thread Merlin Moncure
On Mon, Jul 26, 2010 at 11:07 AM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Jul 26, 2010 at 10:39 AM, Merlin Moncure mmonc...@gmail.com wrote:
 It was absolutely a good decision because it prevented type inference
 in ways that were ambiguous or surprising (for a canonical case see:
 http://www.mail-archive.com/pgsql-gene...@postgresql.org/msg93224.html).

 || operator is still pretty tolerant in the 8.3+ world.
 select interval_col || bool_col; -- error
 select interval_col::text || bool_col; -- text concatenation
 select text_col || interval_col || bool_col; -- text concatenation

 variadic text would require text casts on EVERY non 'unknown' argument
 which drops it below the threshold of usefulness IMO -- it would be
 far stricter than vanilla || concatenation.  Ok, pure bikeshed here
 (shutting my trap now!), but concat() is one of those wonder functions
 that you want to make as usable and terse as possible.  I don't see
 the value in making it overly strict.

 I'm just very skeptical that we should give our functions argument
 types that are essentially fantasy.  CONCAT() doesn't concatenate
 integers or intervals or boxes: it concatenates strings, and only
 strings.  Surely the right fix, if our casting rules are too
 restrictive, is to fix the casting rules; not to start adding a bunch
 of kludgery in every function we define.

 The problem with your canonical example (and the other examples of
 this type I've seen) is that they are underspecified.  Given two
 identically-named operators, one of which accepts types T1 and T2, and
 the other of which accepts types T3 and T4, what is one to do with T1
 OP T4?  You can cast T1 to T3 and call the first operator or you can
 cast T4 to T2 and call the second one, and it's really not clear which
 is right, so you had better thrown an error.  The same applies to
 functions: given foo(text) and foo(date) and the call
 foo('2010-04-15'), you had better complain, or you may end up with a
 very sad user.  But sometimes our current casting rules require casts
 in situations where they don't seem necessary, such as
 LPAD(integer_column, '0', 5).  That's not ambiguous because there's
 only one definition of LPAD, and the chances that the user will be
 unpleasantly surprised if you call it seem quite low.

 In other words, this problem is not unique to CONCAT().

shoot, can't resist :-).

Are the casting rules broken? If you want to do lpad w/o casts for
integers, you can define it explicitly -- feature, not bug.  You can
basically do this for any function with fixed arguments -- either in
userland or core.  lpad(int) actually introduces a problem case with
the minus sign possibly requiring application specific intervention,
so things are probably correct exactly as they are.  Casting rules
need to be tight because if they are not they can introduce
independent behaviors when you underspecify as you noted above.

ISTM you are unhappy with the any variadic mechanism in general, not
the casting rules.  any essentially means: the types you pass to me
are irrelevant, because i'm going to look up behaviors on the server
and apply them as I see fit.  You've defined a regular methodology
across ALL types and allow the user to pass anything -- I see nothing
at all wrong with this as long as it's only implemented in very
special cases.  If you happen to not like the predefined 'box'
behavior, nothing is stopping you from massaging it before sending it
in.  Also, there's no guarantee that the behavior hidden under the
any can be reproduced via manual cast...concat() is some thing of a
special case where you can.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-25 Thread Itagaki Takahiro
I merged and enhanced some part of your patch:
 - contrib/stringfunc are merged in the core patch
 - Old format() is replaced with sprintf(), but the function name is
still format().
 - Support %q as alias for %iq.

2010/7/25 Pavel Stehule pavel.steh...@gmail.com:
 fixed - it depends on INT64_FORMAT now.
I modified the code a bit not to expect 'll' or 'l'.

 %lq ... literal quoted
 %iq ... ident quoted
I also modified 'q' without specifier, i.e, %q is handled as same as %lq.

 But I found there is a design issue in format() :
 I prefer a current behave - RAISE statement uses same and it is not
 reported as bug for ten years

I think RAISE is badly designed. Using % as a placeholder has a limitation
to format strings. For example, format() cannot work as concat():
  SELECT format('%%', 123, 456) = ERROR

So, my proposal is renaming stringfunc//sprintf() to format(),
and moving it into the core. I think sprintf() is superior to format()
in every aspect; '%s%s' works as concat(), and '%s%%' can append
% without blanks.

Then, concat_ws() will be moved into core because contrib/stringfunc
only has the function now. In addition, I'd like to include the function for
the compatibility to MySQL. Also, concat() and concat_ws() can share
the implementation.

Comments?

-- 
Itagaki Takahiro


stringfunc_core-20100726.diff
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-25 Thread Pavel Stehule
2010/7/26 Itagaki Takahiro itagaki.takah...@gmail.com:
 I merged and enhanced some part of your patch:
  - contrib/stringfunc are merged in the core patch
  - Old format() is replaced with sprintf(), but the function name is
 still format().
  - Support %q as alias for %iq.

 2010/7/25 Pavel Stehule pavel.steh...@gmail.com:
 fixed - it depends on INT64_FORMAT now.
 I modified the code a bit not to expect 'll' or 'l'.

 %lq ... literal quoted
 %iq ... ident quoted
 I also modified 'q' without specifier, i.e, %q is handled as same as %lq.

 But I found there is a design issue in format() :
 I prefer a current behave - RAISE statement uses same and it is not
 reported as bug for ten years

 I think RAISE is badly designed. Using % as a placeholder has a limitation
 to format strings. For example, format() cannot work as concat():
  SELECT format('%%', 123, 456) = ERROR

 So, my proposal is renaming stringfunc//sprintf() to format(),
 and moving it into the core. I think sprintf() is superior to format()
 in every aspect; '%s%s' works as concat(), and '%s%%' can append
 % without blanks.


Sorry, I am strong against. Using a format just for string string
concation is bad idea - there are a concat function, there are ||
operator. Look on complexity of format/RAISE and sprintf. More - it
can be strange, when we have a format function and it is almost
sprintf. I still prefer simplicity of format - you have a true - it
has a issue, but there are simply workaround

format('%', 123||345)

format is very simple - but usually you don't need to specify a with,
a precision.

sprintf has some issue based on common sprintf implementation and
expecting too. For example a precision is used very dynamically - it
has a different sense for integers and for floats, so I wouldn't have
a sprintf in core.

 Then, concat_ws() will be moved into core because contrib/stringfunc
 only has the function now. In addition, I'd like to include the function for
 the compatibility to MySQL. Also, concat() and concat_ws() can share
 the implementation.

 Comments?

I disagree - please, return to prev variant

Regards

Pavel Stehule

 --
 Itagaki Takahiro


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-25 Thread Itagaki Takahiro
2010/7/26 Pavel Stehule pavel.steh...@gmail.com:
 sprintf has some issue based on common sprintf implementation and
 expecting too. For example a precision is used very dynamically - it
 has a different sense for integers and for floats, so I wouldn't have
 a sprintf in core.

Why do we need to have similar functions in core and contrib?
It will just confuse users. If you want to RAISE-version of format(),
I don't want to have stringfunc in contrib.

sprintf() is cool! So, I'd like to use sprintf() by default rather than
format() which has limited features. Almost all users don't know
well about contrib modules. Books about functions in inter-databases
don't consider about postgres' contrib modules. That's why I want to
move the useful features into core rather than contrib modules.

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-25 Thread Pavel Stehule
2010/7/26 Itagaki Takahiro itagaki.takah...@gmail.com:
 2010/7/26 Pavel Stehule pavel.steh...@gmail.com:
 sprintf has some issue based on common sprintf implementation and
 expecting too. For example a precision is used very dynamically - it
 has a different sense for integers and for floats, so I wouldn't have
 a sprintf in core.

 Why do we need to have similar functions in core and contrib?
 It will just confuse users. If you want to RAISE-version of format(),
 I don't want to have stringfunc in contrib.


:(

please, look back to discus about this module. There was desided, so
format will be in core and sprintf in contrib. One reason for this
decision was complexity of printf's implementation.

 sprintf() is cool! So, I'd like to use sprintf() by default rather than
 format() which has limited features. Almost all users don't know
 well about contrib modules. Books about functions in inter-databases
 don't consider about postgres' contrib modules. That's why I want to
 move the useful features into core rather than contrib modules.


I have a different opinion and I am not alone. sprintf is good for c
language, but it is problematic in scripting environments, where are
not pointers, where we have more info about variables - where we can
use a introspection - it is like dinosaurus in IT. My implementation
is little bit simple, bacause it is use a buildin functionality - but
still it has more then hundred rows. The full implementation has about
thousand rows. More sprintf is little bit slower than format - it have
to do little bit more work - and it can be confusing for people who
doesn't known it well.

for example - sprintf(%d, 10.2) --- 10.

next - sprintf respect common standard - but this standard doesn't
calculate with PostgreSQL datatypes - there are not support for
date, timestemp for example.

Function format is designed to work with builtin function to_char.
This is simple and full functional combination - I have not a plan to
replace it.

Regards

Pavel Stehule




 --
 Itagaki Takahiro


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-23 Thread Itagaki Takahiro
I'm reviewing contrib part of the string functions patch.

I found an issue in sprintf() to print integer values. In this case,
'l' (for long type) is used on *all* platforms. For example,
  SELECT sprintf('%d', 10);
internally uses
  appendStringInfo('%ld', (int64) 10)

But there are some platform that requires to use %lld for int64 format, probably
on Windows. That's why we have INT64_FORMAT macro. sprintf() needs to be
adjusted to use INT64_FORMAT or similar portable codes.

Other portion of the patch seems to be OK for me,
unless you have still some idea to extend the feature.

2010/7/17 Pavel Stehule pavel.steh...@gmail.com:
 I have a one idea nonstandard enhancing of sprintf - relatie often job
 is a quoting in PostgreSQL. So sprintf should have a special formats
 for quoted values. What do you think about

 %lq ... literal quoted
 %iq ... ident quoted

They save some keyboard types to write quote_literal() and quote_ident(), right?
They seem to be useful and reasonable for me. One comment is that you might
want to print NULL values as NULL instead of NULL in such cases.

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-23 Thread Pavel Stehule
Hello

2010/7/23 Itagaki Takahiro itagaki.takah...@gmail.com:
 I'm reviewing contrib part of the string functions patch.

 I found an issue in sprintf() to print integer values. In this case,
 'l' (for long type) is used on *all* platforms. For example,
  SELECT sprintf('%d', 10);
 internally uses
  appendStringInfo('%ld', (int64) 10)

 But there are some platform that requires to use %lld for int64 format, 
 probably
 on Windows. That's why we have INT64_FORMAT macro. sprintf() needs to be
 adjusted to use INT64_FORMAT or similar portable codes.

ok, I'll look on it


 Other portion of the patch seems to be OK for me,
 unless you have still some idea to extend the feature.

 2010/7/17 Pavel Stehule pavel.steh...@gmail.com:
 I have a one idea nonstandard enhancing of sprintf - relatie often job
 is a quoting in PostgreSQL. So sprintf should have a special formats
 for quoted values. What do you think about

 %lq ... literal quoted
 %iq ... ident quoted

 They save some keyboard types to write quote_literal() and quote_ident(), 
 right?
 They seem to be useful and reasonable for me. One comment is that you might
 want to print NULL values as NULL instead of NULL in such cases.


yes, it is good note

Thank You very much

Regards

Pavel Stehule

 --
 Itagaki Takahiro


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-22 Thread Itagaki Takahiro
2010/7/21 Pavel Stehule pavel.steh...@gmail.com:
 It is about 2% slower for UTF8 encoding. So it isn't significant for me.
 I agree with your changes. Thank You very much

Thanks. The core-part is almost ready to commit.
I'll continue to review the contrib part.

But I found there is a design issue in format() :
Appending a '%' is common use-case, but format() cannot append
% char without any spaces between placeholder and the raw % char.

itagaki=# SELECT format('%%%', 10), format('% %%', 10);
 format | format
+
 %10| 10 %
(1 row)

It is a design issue, and RAISE in PL/pgSQL has the same issue, too.
Do we accept the restriction? Or should we add another escape
syntax and/or placeholder pattern?

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-22 Thread Pavel Stehule
2010/7/23 Itagaki Takahiro itagaki.takah...@gmail.com:
 2010/7/21 Pavel Stehule pavel.steh...@gmail.com:
 It is about 2% slower for UTF8 encoding. So it isn't significant for me.
 I agree with your changes. Thank You very much

 Thanks. The core-part is almost ready to commit.
 I'll continue to review the contrib part.

 But I found there is a design issue in format() :
 Appending a '%' is common use-case, but format() cannot append
 % char without any spaces between placeholder and the raw % char.

 itagaki=# SELECT format('%%%', 10), format('% %%', 10);
  format | format
 +
  %10    | 10 %
 (1 row)

 It is a design issue, and RAISE in PL/pgSQL has the same issue, too.
 Do we accept the restriction? Or should we add another escape
 syntax and/or placeholder pattern?


I prefer a current behave - RAISE statement uses same and it is not
reported as bug for ten years, what I read a mailing lists. I would to
have a FORMAT implementation simple as possible.

and there is simple workaround:

postgres=# create or replace function fx()
returns void as $$
begin
  raise notice '%', '%';
end;
$$ language plpgsql;
CREATE FUNCTION
Time: 560.063 ms
postgres=# select fx();
NOTICE:  %
 fx


(1 row)

Regards

Pavel Stehule

 --
 Itagaki Takahiro


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-21 Thread Pavel Stehule
2010/7/21 Itagaki Takahiro itagaki.takah...@gmail.com:
 I reviewed the core changes of the patch. I don't think we need
 mb_string_info() at all. Instead, we can just call pg_mbxxx() functions.

 I rewrote the patch to use pg_mbstrlen_with_len() and pg_mbcharcliplen().
 What do you think the changes? It requires re-counting lengths of multi-byte
 strings in some cases, but the code will be much simpler and can avoid
 allocating length buffers.


It is a good idea. I see a problem only for right function, where
for most common use case a mblen will be called two times. I am not
able to say now, if this can be a performance issue or not. Highly
probably not - only for very large strings.

postgres=# create or replace function randomstr(int) returns text as
$$select string_agg(substring('abcdefghijklmnop' from
trunc(random()*13)::int+1 for 1),'') from generate_series(1,$1) $$
language sql;
CREATE FUNCTION
Time: 27,452 ms

postgres=# select count(*) from(select right(randomstr(1000),3) from
generate_series(1,1))x;
 count
---
 1
(1 row)

Time: 5615,061 ms
postgres=# select count(*) from(select right(randomstr(1000),3) from
generate_series(1,1))x;
 count
---
 1
(1 row)

Time: 5606,937 ms
postgres=# select count(*) from(select right(randomstr(1000),3) from
generate_series(1,1))x;
 count
---
 1
(1 row)

Time: 5630,771 ms

postgres=# select count(*) from(select right(randomstr(1000),3) from
generate_series(1,1))x;
 count
---
 1
(1 row)

Time: 5753,063 ms
postgres=# select count(*) from(select right(randomstr(1000),3) from
generate_series(1,1))x;
 count
---
 1
(1 row)
Time: 5755,776 ms

It is about 2% slower for UTF8 encoding. So it isn't significant for me.

I agree with your changes. Thank You very much

Regards

Pavel Stehule

 I'd like to apply contrib/stringinfo apart from the core changes,
 because there seems to be still some idea to improve sprintf().

 --
 Itagaki Takahiro


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-20 Thread Itagaki Takahiro
I reviewed the core changes of the patch. I don't think we need
mb_string_info() at all. Instead, we can just call pg_mbxxx() functions.

I rewrote the patch to use pg_mbstrlen_with_len() and pg_mbcharcliplen().
What do you think the changes? It requires re-counting lengths of multi-byte
strings in some cases, but the code will be much simpler and can avoid
allocating length buffers.

I'd like to apply contrib/stringinfo apart from the core changes,
because there seems to be still some idea to improve sprintf().

-- 
Itagaki Takahiro


stringfunc_core-20100721.diff
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-16 Thread Pavel Stehule
Hello

I have a one idea nonstandard enhancing of sprintf - relatie often job
is a quoting in PostgreSQL. So sprintf should have a special formats
for quoted values. What do you think about

%lq ... literal quoted
%iq ... ident quoted

??

Regards

Pavel

2010/7/13 Pavel Stehule pavel.steh...@gmail.com:
 Hello

 2010/7/13 Itagaki Takahiro itagaki.takah...@gmail.com:
 2010/7/13 Pavel Stehule pavel.steh...@gmail.com:
 so this is actualised patch:
 * concat_sql removed
 * left, right, reverse and concat are in core
 * printf and concat_ws are in contrib
 * format show NULL as NULL string
 * removed an using of wide chars

 I think function codes in the core (concat, format, left, right,
 and reverse) are ready for committers. They also have docs, but
 the names are not listed in Index page (bookindex.html).
 Please add
   indexterm
    primaryfuncname/primary
   /indexterm
 in func.sgml for each new function.


 fixed
 However, I have a couple of comments to stringfunc module. sprintf()
 and concat_ws() are not installed by default, but provided by the module.

 todo:
 NULL handling for printf function

 I like NULL for null arguments. It is just same as format() and RAISE.

 done


 === Questions ===
 * concat_ws() transforms NULLs into empty strings.
 Is it an intended behavior and compatible with MySQL?
 Note that string_agg() doesn't add separators to NULLs.


 no I was  wrong - original concat_ws just ignore NULL - fixed, now
 concat_ws has same behave like original.

  =# SELECT coalesce(concat_ws(',', 'A', NULL, 'B'), '(null)');
   coalesce
  --
   A,,B
  (1 row)

 * concat_ws() returns NULL when the separator is NULL.
 Is it an intended behavior and compatible with MySQL?

  =# SELECT coalesce(concat_ws(NULL, 'A', NULL, 'B'), '(null)');
   coalesce
  --
   (null)
  (1 row)

 === Trivial issues ===
 * Some function prototypes are declared but not used.
  We can just remove them.
  - mb_string_info()
  - stringfunc_concat(PG_FUNCTION_ARGS);
  - stringfunc_left(PG_FUNCTION_ARGS);
  - stringfunc_right(PG_FUNCTION_ARGS);
  - stringfunc_reverse(PG_FUNCTION_ARGS);

 * Some error messages need to be improved.
  For example, 1th is wrong.
    =# select sprintf('%*s', NULL, 'abcdef');
    ERROR:  null value not allowed
    HINT:  width (1th) arguments is NULL

 have you a some idea about it?


 * sprintf() has some typos in error messages
  For example, sprinf.


 fixed

 --
 Itagaki Takahiro


 Regards

 Pavel


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-13 Thread Itagaki Takahiro
2010/7/13 Pavel Stehule pavel.steh...@gmail.com:
 so this is actualised patch:
 * concat_sql removed
 * left, right, reverse and concat are in core
 * printf and concat_ws are in contrib
 * format show NULL as NULL string
 * removed an using of wide chars

I think function codes in the core (concat, format, left, right,
and reverse) are ready for committers. They also have docs, but
the names are not listed in Index page (bookindex.html).
Please add
   indexterm
primaryfuncname/primary
   /indexterm
in func.sgml for each new function.

However, I have a couple of comments to stringfunc module. sprintf()
and concat_ws() are not installed by default, but provided by the module.

 todo:
 NULL handling for printf function

I like NULL for null arguments. It is just same as format() and RAISE.

=== Questions ===
* concat_ws() transforms NULLs into empty strings.
Is it an intended behavior and compatible with MySQL?
Note that string_agg() doesn't add separators to NULLs.

  =# SELECT coalesce(concat_ws(',', 'A', NULL, 'B'), '(null)');
   coalesce
  --
   A,,B
  (1 row)

* concat_ws() returns NULL when the separator is NULL.
Is it an intended behavior and compatible with MySQL?

  =# SELECT coalesce(concat_ws(NULL, 'A', NULL, 'B'), '(null)');
   coalesce
  --
   (null)
  (1 row)

=== Trivial issues ===
* Some function prototypes are declared but not used.
  We can just remove them.
  - mb_string_info()
  - stringfunc_concat(PG_FUNCTION_ARGS);
  - stringfunc_left(PG_FUNCTION_ARGS);
  - stringfunc_right(PG_FUNCTION_ARGS);
  - stringfunc_reverse(PG_FUNCTION_ARGS);

* Some error messages need to be improved.
  For example, 1th is wrong.
=# select sprintf('%*s', NULL, 'abcdef');
ERROR:  null value not allowed
HINT:  width (1th) arguments is NULL

* sprintf() has some typos in error messages
  For example, sprinf.

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-12 Thread Pavel Stehule
Hello

2010/7/12 Robert Haas robertmh...@gmail.com:
 On Sun, Jul 11, 2010 at 10:30 PM, Itagaki Takahiro
 itagaki.takah...@gmail.com wrote:
 2010/7/9 Pavel Stehule pavel.steh...@gmail.com:
 I am sending a actualised patch
 * removed concat_json
 * renamed function rvsr to reverse
 * functions format, sprintf and concat* are stable now (as to_char for 
 example)

 I'd like to move all proposed functions into the core, and not to add
 contrib/stringfunc.
 I think those functions are very useful and worth adding in core.
 * concat(), concat_ws(), reverse(), left() and right() are ready to commit.
 * format() is almost ready, except consensus of NULL representation.

what solution for this moment - be a consistent with RAISE statement ???

 * sprintf() is also useful, but we cannot use swprintf() in it because
  there are many problems in converting to wide chars. We should
  develop mbchar-aware version of %s formatter.

ook I'll work on this - but there is same problem with NULL like a
format function

 * IMHO, concat_sql() has very limited use cases. Boolean  and numeric
  values are not quoted, but still need product-specific conversions because
  some DBs prefer 1/0 instead of true/false.
  Also, dblink_build_sql_insert() provides similar functionality. Will
 we have both?


I can remove it - when I checked it I found so it doesn't well
serialize PostgreSQL specific types as array or record, so I am not
against to remove it now.

 I'm all in favor of putting such things in core as are supported by
 multiple competing products, but is that really true for all of these?


I have not a strong opinion on this - I would to like see reverse and
format in core. But I think, so contrib is enought. Can somebody from
commiters to decide it, please? Any sprintf implemenation needs lots
of code - minimally for this function I prefer contrib for this
function.

Regards

Pavel Stehule


 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-12 Thread Robert Haas
On Jul 11, 2010, at 10:32 PM, Itagaki Takahiro itagaki.takah...@gmail.com 
wrote:
 2010/7/12 Robert Haas robertmh...@gmail.com:
 I'm all in favor of putting such things in core as are supported by
 multiple competing products, but is that really true for all of these?
 
 - concat() : MySQL, Oracle, DB2
 - concat_ws() : MySQL,
 - left(), right() : MySQL, SQL Server, DB2
 - reverse() : MySQL, SQL Server, Oracle (as utl_raw.reverse)
 
 concat_sql(), format(), and sprintf() will be our unique features.

I think concat, left, right, and reverse should go into core, and perhaps 
format also.  The rest sound like contrib material to me.

...Robert
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-12 Thread Kevin Grittner
Itagaki Takahiro itagaki.takah...@gmail.com wrote:
 
 I'd like to move all proposed functions into the core, and not to
 add contrib/stringfunc.
 
 Still failed :-(  I used UTF8 database with *locale=C* on 64bit
 Linux.
 char2wchar() doesn't seem to work on C locale. We should avoid
 using the function and converting mb chars to wide chars.
 
   select sprintf('%10s %10d', 'hello', 10);
 ! server closed the connection unexpectedly
 TRAP: FailedAssertion(!(!lc_ctype_is_c()), File: mbutils.c,
 Line: 715)
 
 #0  0x0038c0c332f5 in raise () from /lib64/libc.so.6
 #1  0x0038c0c34b20 in abort () from /lib64/libc.so.6
 #2  0x006e951d in ExceptionalCondition
 (conditionName=value optimized out, errorType=value optimized
 out, fileName=value optimized out, lineNumber=value optimized
 out) at assert.c:57
 #3  0x006fa8bf in char2wchar (to=0x1daf188 L, tolen=16,
 from=0x1da95b0 %*s, fromlen=3) at mbutils.c:715
 #4  0x7f8e8c436d37 in stringfunc_sprintf
 (fcinfo=0x7fff9bdcd4b0)
 at stringfunc.c:463
 
Based on this and subsequent posts, I've changed this patch's status
to Waiting on Author.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-12 Thread Pavel Stehule
2010/7/12 Kevin Grittner kevin.gritt...@wicourts.gov:
 Itagaki Takahiro itagaki.takah...@gmail.com wrote:

 I'd like to move all proposed functions into the core, and not to
 add contrib/stringfunc.

 Still failed :-(  I used UTF8 database with *locale=C* on 64bit
 Linux.
 char2wchar() doesn't seem to work on C locale. We should avoid
 using the function and converting mb chars to wide chars.

   select sprintf('%10s %10d', 'hello', 10);
 ! server closed the connection unexpectedly
 TRAP: FailedAssertion(!(!lc_ctype_is_c()), File: mbutils.c,
 Line: 715)

 #0  0x0038c0c332f5 in raise () from /lib64/libc.so.6
 #1  0x0038c0c34b20 in abort () from /lib64/libc.so.6
 #2  0x006e951d in ExceptionalCondition
 (conditionName=value optimized out, errorType=value optimized
 out, fileName=value optimized out, lineNumber=value optimized
 out) at assert.c:57
 #3  0x006fa8bf in char2wchar (to=0x1daf188 L, tolen=16,
 from=0x1da95b0 %*s, fromlen=3) at mbutils.c:715
 #4  0x7f8e8c436d37 in stringfunc_sprintf
 (fcinfo=0x7fff9bdcd4b0)
 at stringfunc.c:463

 Based on this and subsequent posts, I've changed this patch's status
 to Waiting on Author.

ook - I'll send actualised version tomorrow

Pavel


 -Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-12 Thread Pavel Stehule
Hello

so this is actualised patch:

* concat_sql removed
* left, right, reverse and concat are in core
* printf and concat_ws are in contrib
* format show NULL as NULL string
* removed an using of wide chars

todo:

NULL handling for printf function

Query:
what is corect result for

* printf(%3.2d, NULL) ??
* printf(%3.2s, NULL) ??

Regards

Pavel Stehule


2010/7/12 Itagaki Takahiro itagaki.takah...@gmail.com:
 2010/7/12 Robert Haas robertmh...@gmail.com:
 I'm all in favor of putting such things in core as are supported by
 multiple competing products, but is that really true for all of these?

 - concat() : MySQL, Oracle, DB2
 - concat_ws() : MySQL,
 - left(), right() : MySQL, SQL Server, DB2
 - reverse() : MySQL, SQL Server, Oracle (as utl_raw.reverse)

 concat_sql(), format(), and sprintf() will be our unique features.

 --
 Itagaki Takahiro



stringfunc.diff
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-11 Thread Itagaki Takahiro
2010/7/9 Pavel Stehule pavel.steh...@gmail.com:
 I am sending a actualised patch
 * removed concat_json
 * renamed function rvsr to reverse
 * functions format, sprintf and concat* are stable now (as to_char for 
 example)

I'd like to move all proposed functions into the core, and not to add
contrib/stringfunc.
I think those functions are very useful and worth adding in core.
* concat(), concat_ws(), reverse(), left() and right() are ready to commit.
* format() is almost ready, except consensus of NULL representation.
* sprintf() is also useful, but we cannot use swprintf() in it because
  there are many problems in converting to wide chars. We should
  develop mbchar-aware version of %s formatter.
* IMHO, concat_sql() has very limited use cases. Boolean  and numeric
  values are not quoted, but still need product-specific conversions because
  some DBs prefer 1/0 instead of true/false.
  Also, dblink_build_sql_insert() provides similar functionality. Will
we have both?


 it worked on my station :( - Fedora 64bit
Still failed :-(  I used UTF8 database with *locale=C* on 64bit Linux.
char2wchar() doesn't seem to work on C locale. We should avoid
using the function and converting mb chars to wide chars.

  select sprintf('%10s %10d', 'hello', 10);
! server closed the connection unexpectedly
TRAP: FailedAssertion(!(!lc_ctype_is_c()), File: mbutils.c, Line: 715)

#0  0x0038c0c332f5 in raise () from /lib64/libc.so.6
#1  0x0038c0c34b20 in abort () from /lib64/libc.so.6
#2  0x006e951d in ExceptionalCondition (conditionName=value
optimized out, errorType=value optimized out, fileName=value
optimized out,
lineNumber=value optimized out) at assert.c:57
#3  0x006fa8bf in char2wchar (to=0x1daf188 L, tolen=16,
from=0x1da95b0 %*s, fromlen=3) at mbutils.c:715
#4  0x7f8e8c436d37 in stringfunc_sprintf (fcinfo=0x7fff9bdcd4b0)
at stringfunc.c:463

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-11 Thread Robert Haas
On Sun, Jul 11, 2010 at 10:30 PM, Itagaki Takahiro
itagaki.takah...@gmail.com wrote:
 2010/7/9 Pavel Stehule pavel.steh...@gmail.com:
 I am sending a actualised patch
 * removed concat_json
 * renamed function rvsr to reverse
 * functions format, sprintf and concat* are stable now (as to_char for 
 example)

 I'd like to move all proposed functions into the core, and not to add
 contrib/stringfunc.
 I think those functions are very useful and worth adding in core.
 * concat(), concat_ws(), reverse(), left() and right() are ready to commit.
 * format() is almost ready, except consensus of NULL representation.
 * sprintf() is also useful, but we cannot use swprintf() in it because
  there are many problems in converting to wide chars. We should
  develop mbchar-aware version of %s formatter.
 * IMHO, concat_sql() has very limited use cases. Boolean  and numeric
  values are not quoted, but still need product-specific conversions because
  some DBs prefer 1/0 instead of true/false.
  Also, dblink_build_sql_insert() provides similar functionality. Will
 we have both?

I'm all in favor of putting such things in core as are supported by
multiple competing products, but is that really true for all of these?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-11 Thread Itagaki Takahiro
2010/7/12 Robert Haas robertmh...@gmail.com:
 I'm all in favor of putting such things in core as are supported by
 multiple competing products, but is that really true for all of these?

- concat() : MySQL, Oracle, DB2
- concat_ws() : MySQL,
- left(), right() : MySQL, SQL Server, DB2
- reverse() : MySQL, SQL Server, Oracle (as utl_raw.reverse)

concat_sql(), format(), and sprintf() will be our unique features.

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-10 Thread Erik Rijkers
contrib/stringfunc was missing this small change in contrib/Makefile, I think.  
With it, it
installs and runs make check cleanly.


Erik Rijkers



stringfunc_fix.diff
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-09 Thread Pavel Stehule
hello

2010/7/9 Takahiro Itagaki itagaki.takah...@gmail.com:
 2010/7/8 Pavel Stehule pavel.steh...@gmail.com:
 sorry, attached fixed patch

 Make installcheck for contrib/stringfunc is broken.
 Please run regression test with --enable-cassert build.
  test stringfunc           ... TRAP:
 FailedAssertion(!(!lc_ctype_is_c()), File: mbutils.c, Line: 715)
  LOG:  server process (PID 15121) was terminated by signal 6: Aborted



 This patch contains several functions.
 - format(fmt text, VARIADIC args any)
 - sprintf(fmt text, VARIADIC args any)
 - concat(VARIADIC args any)
 - concat_ws(separator text, VARIADIC args any)
 - concat_json(VARIADIC args any)
 - concat_sql(VARIADIC args any)
 - rvrs(str text)
 - left(str text, n int)
 - right(str text, n int)

 The first one is in the core, and others are in contrib/stringfunc.
 But I think almost
 all of them should be in the core, because users want to write portable SQLs.
 Contrib modules are not always available.  Note that concat() is
 supported by Oracle,
 MySQL, and DB2. Also left() and right() are supported by MySQL, DB2,
 and SQL Server.

 Functions that depend on GUC settings should be marked as VOLATILE
 instead of IMMUTABLE. I think format(), sprintf(), and all of
 concat()s should be
 volatile because at least timestamp depends on datestyle parameter.


ok, I'll fix it

 concat_ws() and rvrs() should be renamed to non-abbreviated forms.
 How about concat_with_sep() and reverse() ?


I used a well known names - concat_ws (MySQL) and rvrs (Oracle rdbms),
I like concat_ws - concat_with_sep is maybe too long. rvrs is too
short, so I'll rename it to reverse - ok?

 I think we should avoid concat_json() at the moment because there is another
 development project for JSON support. The result type will be JOIN type rather
 than text then.


ok

 I'm not sure usefulness of concat_sql(). Why don't you just quote all values
 with quotes and separate them with comma?


concat_xxx functions are helpers to serialisation. So when when you
would to generate INSERT statements for some export, and you cannot
use a COPY statement, you can do

FOR r IN
  SELECT 
LOOP
  RETURN NEXT 'INSERT INTO tab(..) VALUES (' || concat_sql(r.a, r.b, r.c, ... )
END LOOP;
RETURN;

you don't need to solve anything and output is well formated SQL. Some
databases dislike quoted numeric values - and quoted nums can be
sonfusing


 format() function prints NULL as NULL, but RAISE statement in PL/pgSQL
 does as NULL.
 I prefer just NULL.
 maybe some GUC variable
 stringfunc.null_string = 'NULL' in future??

 We have some choices for NULL representation. For example, empty string,
 NULL, NULL, or (null) . What will be our choice?   Each of them looks
 equally reasonable for me. GUC idea is also good because we need to
 mark format() as VOLATILE anyway. We have nothing to lose.


Can ve to solve it other patch? I know to aversion core hackers to new
GUC. Now I propose just NULL. The GUC for NULL representation has
bigger consequences - probably have to related to RAISE statement, and
to proposed functions to_string, to_array.

 ---
 Takahiro Itagaki


Thank You very much, I'do fix it

Pavel

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-08 Thread Takahiro Itagaki

Pavel Stehule pavel.steh...@gmail.com wrote:

 updated version, concat function doesn't use separator

BTW, didn't you forget stringfunc.sql.in for contrib/stringfunc ?
So, I have not check stringfunc module yet.

I reviewed your patch, and format() in the core is almost ok. It's very cool!
On the other hand, contrib/stringfunc tries to implement safe-sprintf. It's
very complex, and I have questions about multi-byte character handling in it.

* How to print NULL value.
format() function prints NULL as NULL, but RAISE statement in PL/pgSQL
does as NULL. Do we need the same result for them?

postgres=# SELECT format('% vs %', 'NULL', NULL);
format
--
 NULL vs NULL
(1 row)

postgres=# DO $$ BEGIN RAISE NOTICE '% vs %', 'NULL', NULL; END; $$;
NOTICE:  NULL vs NULL
DO

* Error messages: too few/many parameters
  For the same reason, too few/many parameters specified for format()
  might be better for the messages.

  For RAISE in PL/pgSQL:
ERROR:  too few parameters specified for RAISE
ERROR:  too many parameters specified for RAISE

* Why do you need convert multi-byte characters to wide char?
Length specifier in stringfunc_sprintf() means character length.
But is pg_encoding_mbcliplen() enough for the purpose?

* Character-length vs. disp-length in length specifier for sprintf()
For example, '%10s' for sprintf() means 10 characters in the code.
But there might be usages to format text values for display. In such
case, display length might be better for the length specifier.
How about having both s and S?
%10s -- 10 characters
%10S -- 10 disp length; we could use pg_dsplen() for the purpse.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-07-08 Thread Takahiro Itagaki
2010/7/8 Pavel Stehule pavel.steh...@gmail.com:
 sorry, attached fixed patch

Make installcheck for contrib/stringfunc is broken.
Please run regression test with --enable-cassert build.
  test stringfunc   ... TRAP:
FailedAssertion(!(!lc_ctype_is_c()), File: mbutils.c, Line: 715)
  LOG:  server process (PID 15121) was terminated by signal 6: Aborted

This patch contains several functions.
- format(fmt text, VARIADIC args any)
- sprintf(fmt text, VARIADIC args any)
- concat(VARIADIC args any)
- concat_ws(separator text, VARIADIC args any)
- concat_json(VARIADIC args any)
- concat_sql(VARIADIC args any)
- rvrs(str text)
- left(str text, n int)
- right(str text, n int)

The first one is in the core, and others are in contrib/stringfunc.
But I think almost
all of them should be in the core, because users want to write portable SQLs.
Contrib modules are not always available.  Note that concat() is
supported by Oracle,
MySQL, and DB2. Also left() and right() are supported by MySQL, DB2,
and SQL Server.

Functions that depend on GUC settings should be marked as VOLATILE
instead of IMMUTABLE. I think format(), sprintf(), and all of
concat()s should be
volatile because at least timestamp depends on datestyle parameter.

concat_ws() and rvrs() should be renamed to non-abbreviated forms.
How about concat_with_sep() and reverse() ?

I think we should avoid concat_json() at the moment because there is another
development project for JSON support. The result type will be JOIN type rather
than text then.

I'm not sure usefulness of concat_sql(). Why don't you just quote all values
with quotes and separate them with comma?

 format() function prints NULL as NULL, but RAISE statement in PL/pgSQL
 does as NULL.
 I prefer just NULL.
 maybe some GUC variable
 stringfunc.null_string = 'NULL' in future??

We have some choices for NULL representation. For example, empty string,
NULL, NULL, or (null) . What will be our choice?   Each of them looks
equally reasonable for me. GUC idea is also good because we need to
mark format() as VOLATILE anyway. We have nothing to lose.

---
Takahiro Itagaki

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-03-09 Thread David E. Wheeler
On Mar 9, 2010, at 6:30 AM, Pavel Stehule wrote:

 this patch contains a string formatting function format
 
 postgres=# select format('some message: % (current user: %)',
 current_date, current_user);
 format
 
 some message: 2010-03-09 (current user: pavel)
 (1 row)
 
 this patch add new contrib module string functions - contains mainly
 sprintf function:

Seems useful. Add it to the CommitFest so it doesn't get lost?

  https://commitfest.postgresql.org/action/commitfest_view?id=6

Best,

David

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-03-09 Thread Pavel Stehule
2010/3/9 David E. Wheeler da...@kineticode.com:
 On Mar 9, 2010, at 6:30 AM, Pavel Stehule wrote:

 this patch contains a string formatting function format

 postgres=# select format('some message: % (current user: %)',
 current_date, current_user);
                     format
 
 some message: 2010-03-09 (current user: pavel)
 (1 row)

 this patch add new contrib module string functions - contains mainly
 sprintf function:

 Seems useful. Add it to the CommitFest so it doesn't get lost?

  https://commitfest.postgresql.org/action/commitfest_view?id=6


https://commitfest.postgresql.org/action/patch_view?id=287

it is there

Pavel



 Best,

 David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] patch (for 9.1) string functions

2010-03-09 Thread Pavel Stehule
Hello

this patch contains a string formatting function format

postgres=# select format('some message: % (current user: %)',
current_date, current_user);
 format

 some message: 2010-03-09 (current user: pavel)
(1 row)

this patch add new contrib module string functions - contains mainly
sprintf function:

postgres=# select sprintf('some message: %10s (%10s)', current_date,
current_user);
sprintf
---
 some message: 2010-03-09 ( pavel)
(1 row)

postgres=# select sprintf('some message: %10s (%-10s)', current_date,
current_user);
sprintf
---
 some message: 2010-03-09 (pavel )
(1 row)


some string variadic functions

postgres=# select concat('ahaha',10,null,current_date, true);
 concat

 ahaha,10,,2010-03-09,t
(1 row)

postgres=# select concat_sql('ahaha',10,null,current_date, true);
   concat_sql

 'ahaha',10,NULL,'2010-03-09',t
(1 row)

postgres=# select concat_json('ahaha'::text,10,null,current_date, true);
concat_json
---
 ahaha,10,null,2010-03-09,true
(1 row)


and some basic text function rvrs, left, right.

Regards
Pavel Stehule


stringfunc.diff
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-03-09 Thread Yeb Havinga

Pavel Stehule wrote:

Hello

this patch contains a string formatting function format
  

Hi Pavel,

This is supercool. Haven't tried it out yet but surely will, thanks!

Yeb Havinga



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-03-09 Thread Merlin Moncure
On Tue, Mar 9, 2010 at 9:30 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 postgres=# select concat('ahaha',10,null,current_date, true);
         concat
 
  ahaha,10,,2010-03-09,t

why are there commas in the output?

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-03-09 Thread Pavel Stehule
2010/3/9 Merlin Moncure mmonc...@gmail.com:
 On Tue, Mar 9, 2010 at 9:30 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 postgres=# select concat('ahaha',10,null,current_date, true);
         concat
 
  ahaha,10,,2010-03-09,t

 why are there commas in the output?


I though so comma can be default separator - but I see - it is wrong -
separator have to be empty string.

Pavel


 merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-03-09 Thread Pavel Stehule
updated version, concat function doesn't use separator

Pavel

2010/3/9 Pavel Stehule pavel.steh...@gmail.com:
 2010/3/9 Merlin Moncure mmonc...@gmail.com:
 On Tue, Mar 9, 2010 at 9:30 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 postgres=# select concat('ahaha',10,null,current_date, true);
         concat
 
  ahaha,10,,2010-03-09,t

 why are there commas in the output?


 I though so comma can be default separator - but I see - it is wrong -
 separator have to be empty string.

 Pavel


 merlin




stringfunc.diff
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-03-09 Thread Merlin Moncure
On Tue, Mar 9, 2010 at 1:45 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 updated version, concat function doesn't use separator

btw...very cool stuff.  I took a brief look at the sprintf
implementation.  The main switch:
switch (pdesc.field_type)

It looks like format codes we choose not to support (like %p) are
silently ignored.  Is this the correct behavior?

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions

2010-03-09 Thread Pavel Stehule
2010/3/9 Merlin Moncure mmonc...@gmail.com:
 On Tue, Mar 9, 2010 at 1:45 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 updated version, concat function doesn't use separator

 btw...very cool stuff.  I took a brief look at the sprintf
 implementation.  The main switch:
 switch (pdesc.field_type)

 It looks like format codes we choose not to support (like %p) are
 silently ignored.  Is this the correct behavior?

it could be enhanced. sprintf is little bit baroque function and I am
not able to take all details. Please, add comment with proposals for
change.

Pavel


 merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers