Re: [HACKERS] Bug in to_timestamp().

2017-04-08 Thread David Steele
On 2/27/17 4:19 AM, Artur Zakirov wrote:
> On 15.02.2017 15:26, Amul Sul wrote:
>>
>> The new status of this patch is: Ready for Committer
>>
> 
> Thank you for your review!

This submission has been moved to CF 2017-07.

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2017-02-27 Thread Artur Zakirov

On 15.02.2017 15:26, Amul Sul wrote:

The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:tested, passed

Review of v7 patch:
- Patch applies to the top of master HEAD cleanly & make check-world also fine.
- Tom Lane's review comments are fixed.



The new status of this patch is: Ready for Committer



Thank you for your review!

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2017-02-15 Thread Amul Sul
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:tested, passed

Review of v7 patch: 
- Patch applies to the top of master HEAD cleanly & make check-world also fine.
- Tom Lane's review comments are fixed.



The new status of this patch is: Ready for Committer

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2017-01-31 Thread Michael Paquier
On Mon, Dec 5, 2016 at 11:35 AM, Haribabu Kommi
 wrote:
> Moved to next CF with "needs review" status.

Same, this time to CF 2017-03.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-12-04 Thread Haribabu Kommi
On Mon, Nov 7, 2016 at 2:26 AM, Artur Zakirov 
wrote:

>
> Hm, truly. Fixed it.
>
> I've attached the fixed patch.
>

Moved to next CF with "needs review" status.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] Bug in to_timestamp().

2016-11-06 Thread Artur Zakirov
Thank you for your comments,

2016-11-04 20:36 GMT+02:00 Tom Lane :
>
> Artur Zakirov  writes:
> > I attached new version of the patch, which fix is_char_separator()
> > declaration too.
>
> I did some experimenting using
> http://rextester.com/l/oracle_online_compiler
>

>
> which makes it look a lot like Oracle treats separator characters in the
> pattern the same as spaces (but I haven't checked their documentation to
> confirm that).
>
> The proposed patch doesn't seem to me to be trying to follow
> these Oracle behaviors, but I think there is very little reason for
> changing any of this stuff unless we move it closer to Oracle.

Previous versions of the patch doesn't try to follow all Oracle
behaviors. It tries to fix Amul Sul's behaviors. Because I was
confused by dealing with spaces and separators by Oracle
to_timestamp() and there is not information about it in the Oracle
documentation:
https://docs.oracle.com/cd/B19306_01/server.102/b14200/sql_elements004.htm#g195443

I've thought better about it now and fixed the patch. Now parser
removes spaces after and before fields and insists that count of
separators in the input string should match count of spaces or
separators in the formatting string (but in formatting string we can
have more separators than in input string).

>
> Some other nitpicking:
>
> * I think the is-separator function would be better coded like
>
> static bool
> is_separator_char(const char *str)
> {
> /* ASCII printable character, but not letter or digit */
> return (*str > 0x20 && *str < 0x7F &&
> !(*str >= 'A' && *str <= 'Z') &&
> !(*str >= 'a' && *str <= 'z') &&
> !(*str >= '0' && *str <= '9'));
> }
>
> The previous way is neither readable nor remarkably efficient, and it
> knows much more about the ASCII character set than it needs to.

Fixed.

>
> * Don't forget the cast to unsigned char when using isspace() or other
>  functions.

Fixed.

>
> * I do not see the reason for throwing an error here:
>
> +   /* Previous character was a backslash */
> +   if (in_backslash)
> +   {
> +   /* After backslash should go non-space character */
> +   if (isspace(*str))
> +   ereport(ERROR,
> +   
> (errcode(ERRCODE_SYNTAX_ERROR),
> +errmsg("invalid escape 
> sequence")));
> +   in_backslash = false;
>
> Why shouldn't backslash-space be a valid quoting sequence?
>

Hm, truly. Fixed it.

I've attached the fixed patch.

--
Sincerely,
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 2e64cc4..5a4e248 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -6159,7 +6159,8 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
to_timestamp and to_date
skip multiple blank spaces in the input string unless the
FX option is used. For example,
-   to_timestamp('2000JUN', ' MON') works, but
+   to_timestamp('2000JUN', ' MON') and
+   to_timestamp('2000JUN', 'MON') work, but
to_timestamp('2000JUN', 'FX MON') returns an error
because to_timestamp expects one space only.
FX must be specified as the first item in
@@ -6169,6 +6170,19 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
 
  
   
+   to_timestamp and to_date don't
+   skip multiple printable non letter and non digit characters in the input
+   string, but skip them in the formatting string. For example,
+   to_timestamp('2000-JUN', '/MON') and
+   to_timestamp('2000/JUN', '//MON') work, but
+   to_timestamp('2000//JUN', '/MON')
+   returns an error because count of the "/" character in the input string
+   doesn't match count of it in the formatting string.
+  
+ 
+
+ 
+  
Ordinary text is allowed in to_char
templates and will be output literally.  You can put a substring
in double quotes to force it to be interpreted as literal text
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index d4eaa50..d28ceec 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -169,6 +169,8 @@ struct FormatNode
 #define NODE_TYPE_END		1
 #define NODE_TYPE_ACTION	2
 #define NODE_TYPE_CHAR		3
+#define NODE_TYPE_SEPARATOR	4
+#define NODE_TYPE_SPACE		5
 
 #define SUFFTYPE_PREFIX		1
 #define SUFFTYPE_POSTFIX	2
@@ -951,6 +953,7 @@ typedef struct NUMProc
 static const KeyWord *index_seq_search(const char *str, const KeyWord *kw,
  const int *index);
 static const KeySuffix *suff_search(const char *str, const KeySuffix *suf, int type);
+static bool is_separator_char(const char 

Re: [HACKERS] Bug in to_timestamp().

2016-11-04 Thread Tom Lane
Artur Zakirov  writes:
> I attached new version of the patch, which fix is_char_separator()
> declaration too.

I did some experimenting using
http://rextester.com/l/oracle_online_compiler

It appears that Oracle will consider a single space in the pattern
to match zero or more spaces in the input, as all of these produce
the expected result:

SELECT to_timestamp('2000JUN', ' MON') FROM dual
SELECT to_timestamp('2000 JUN', ' MON') FROM dual
SELECT to_timestamp('2000   JUN', ' MON') FROM dual

Also, a space in the pattern will match a single separator character
in the input, but not multiple separators:

SELECT to_timestamp('2000-JUN', ' MON') FROM dual
-- ok
SELECT to_timestamp('2000--JUN', ' MON') FROM dual
ORA-01843: not a valid month

And you can have whitespace along with that single separator:

SELECT to_timestamp('2000 + JUN', ' MON') FROM dual
-- ok
SELECT to_timestamp('2000 +   JUN', ' MON') FROM dual
-- ok
SELECT to_timestamp('2000 ++  JUN', ' MON') FROM dual
ORA-01843: not a valid month

You can have leading whitespace, but not leading separators:

SELECT to_timestamp('   2000 JUN', ' MON') FROM dual
-- ok
SELECT to_timestamp('/2000 JUN', ' MON') FROM dual
ORA-01841: (full) year must be between -4713 and +, and not be 0

These all work:

SELECT to_timestamp('2000 JUN', '/MON') FROM dual
SELECT to_timestamp('2000JUN', '/MON') FROM dual
SELECT to_timestamp('2000/JUN', '/MON') FROM dual
SELECT to_timestamp('2000-JUN', '/MON') FROM dual

but not

SELECT to_timestamp('2000//JUN', '/MON') FROM dual
ORA-01843: not a valid month
SELECT to_timestamp('2000--JUN', '/MON') FROM dual
ORA-01843: not a valid month

which makes it look a lot like Oracle treats separator characters in the
pattern the same as spaces (but I haven't checked their documentation to
confirm that).

The proposed patch doesn't seem to me to be trying to follow 
these Oracle behaviors, but I think there is very little reason for
changing any of this stuff unless we move it closer to Oracle.

Some other nitpicking:

* I think the is-separator function would be better coded like

static bool
is_separator_char(const char *str)
{
/* ASCII printable character, but not letter or digit */
return (*str > 0x20 && *str < 0x7F &&
!(*str >= 'A' && *str <= 'Z') &&
!(*str >= 'a' && *str <= 'z') &&
!(*str >= '0' && *str <= '9'));
}

The previous way is neither readable nor remarkably efficient, and it
knows much more about the ASCII character set than it needs to.

* Don't forget the cast to unsigned char when using isspace() or other
 functions.

* I do not see the reason for throwing an error here:

+   /* Previous character was a backslash */
+   if (in_backslash)
+   {
+   /* After backslash should go non-space character */
+   if (isspace(*str))
+   ereport(ERROR,
+   (errcode(ERRCODE_SYNTAX_ERROR),
+errmsg("invalid escape 
sequence")));
+   in_backslash = false;

Why shouldn't backslash-space be a valid quoting sequence?

I'll set this back to Waiting on Author.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-10-02 Thread Michael Paquier
On Fri, Sep 30, 2016 at 12:34 PM, Amul Sul  wrote:
> The new status of this patch is: Ready for Committer

And moved to next CF with same status.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-09-29 Thread Amul Sul
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:tested, passed

Appreciate your support. 
I've tested v6 patch again, looks good to me, code in v6 does not differ much 
from v4 patch. Ready for committer review. Thanks !

Regards,
Amul Sul

The new status of this patch is: Ready for Committer

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-09-29 Thread Tom Lane
Robert Haas  writes:
> On Thu, Sep 29, 2016 at 4:54 AM, amul sul  wrote:
>> Commitfest status left untouched, I am not sure how to deal with
>> "Returned With Feedback" patch. Is there any chance that, this can be
>> considered again for committer review?

> You may be able to set the status back to "Ready for Committer", or
> else you can move the entry to the next CommitFest and do it there.

I already switched it back to "Needs Review".  AFAIK none of the statuses
are immovable.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-09-29 Thread Robert Haas
On Thu, Sep 29, 2016 at 4:54 AM, amul sul  wrote:
> On Thu, Sep 29, 2016 at 2:48 AM, Tom Lane  wrote:
>> Artur Zakirov  writes:
>>> - now DCH_cache_getnew() is called after parse_format(). Because now
>>> parse_format() can raise an error and in the next attempt
>>> DCH_cache_search() could return broken cache entry.
>>
>> I started looking at your 0001-to-timestamp-format-checking-v4.patch
>> and this point immediately jumped out at me.  Currently the code relies
>> ... without any documentation ... on no elog being thrown out of
>> parse_format().  That's at the very least trouble waiting to happen.
>> There's a hack to deal with errors from within the NUMDesc_prepare
>> subroutine, but it's a pretty ugly and underdocumented hack.  And what
>> you had here was randomly different from that solution, too.
>>
>> After a bit of thought it seemed to me that a much cleaner fix is to add
>> a "valid" flag to the cache entries, which we can leave clear until we
>> have finished parsing the new format string.  That avoids adding extra
>> data copying as you suggested, removes the need for PG_TRY, and just
>> generally seems cleaner and more bulletproof.
>>
>> I've pushed a patch that does it that way.  The 0001 patch will need
>> to be rebased over that (might just require removal of some hunks,
>> not sure).
>>
>> I also pushed 0002-to-timestamp-validation-v2.patch with some revisions
>> (it'd broken acceptance of BC dates, among other things, but I think
>> I fixed everything).
>>
>> Since you told us earlier that you'd be on vacation through the end of
>> September, I'm assuming that nothing more will happen on this patch during
>> this commitfest, so I will mark the CF entry Returned With Feedback.
>
> Behalf of Artur I've rebased patch, removed hunk dealing with broken
> cache entries by copying it, which is no longer required after 83bed06
> commit.
>
> Commitfest status left untouched, I am not sure how to deal with
> "Returned With Feedback" patch. Is there any chance that, this can be
> considered again for committer review?

You may be able to set the status back to "Ready for Committer", or
else you can move the entry to the next CommitFest and do it there.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-09-29 Thread Artur Zakirov
2016-09-29 13:54 GMT+05:00 amul sul :
>
> On Thu, Sep 29, 2016 at 2:48 AM, Tom Lane  wrote:
> >
> > I started looking at your 0001-to-timestamp-format-checking-v4.patch
> > and this point immediately jumped out at me.  Currently the code relies
> > ... without any documentation ... on no elog being thrown out of
> > parse_format().  That's at the very least trouble waiting to happen.
> > There's a hack to deal with errors from within the NUMDesc_prepare
> > subroutine, but it's a pretty ugly and underdocumented hack.  And what
> > you had here was randomly different from that solution, too.
> >
> > After a bit of thought it seemed to me that a much cleaner fix is to add
> > a "valid" flag to the cache entries, which we can leave clear until we
> > have finished parsing the new format string.  That avoids adding extra
> > data copying as you suggested, removes the need for PG_TRY, and just
> > generally seems cleaner and more bulletproof.
> >
> > I've pushed a patch that does it that way.  The 0001 patch will need
> > to be rebased over that (might just require removal of some hunks,
> > not sure).
> >
> > I also pushed 0002-to-timestamp-validation-v2.patch with some revisions
> > (it'd broken acceptance of BC dates, among other things, but I think
> > I fixed everything).

Thank you for committing the 0002 part of the patch! I wanted to fix
cache functions too, but wasn't sure about that.

> >
> > Since you told us earlier that you'd be on vacation through the end of
> > September, I'm assuming that nothing more will happen on this patch during
> > this commitfest, so I will mark the CF entry Returned With Feedback.
>
> Behalf of Artur I've rebased patch, removed hunk dealing with broken
> cache entries by copying it, which is no longer required after 83bed06
> commit.
>
> Commitfest status left untouched, I am not sure how to deal with
> "Returned With Feedback" patch. Is there any chance that, this can be
> considered again for committer review?

Thank you for fixing the patch!
Today I have access to the internet and able to fix and test the
patch. I've looked at your 0001-to-timestamp-format-checking-v5.patch.
It seems nice to me. I suppose it is necessary to fix
is_char_separator() declaration.

from:
static bool is_char_separator(char *str);

to:
static bool is_char_separator(const char *str);

Because now in parse_format() *str argument is const.
I attached new version of the patch, which fix is_char_separator()
declaration too.

Sorry for confusing!

-- 
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


0001-to-timestamp-format-checking-v6.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-09-29 Thread amul sul
On Thu, Sep 29, 2016 at 2:48 AM, Tom Lane  wrote:
> Artur Zakirov  writes:
>> - now DCH_cache_getnew() is called after parse_format(). Because now
>> parse_format() can raise an error and in the next attempt
>> DCH_cache_search() could return broken cache entry.
>
> I started looking at your 0001-to-timestamp-format-checking-v4.patch
> and this point immediately jumped out at me.  Currently the code relies
> ... without any documentation ... on no elog being thrown out of
> parse_format().  That's at the very least trouble waiting to happen.
> There's a hack to deal with errors from within the NUMDesc_prepare
> subroutine, but it's a pretty ugly and underdocumented hack.  And what
> you had here was randomly different from that solution, too.
>
> After a bit of thought it seemed to me that a much cleaner fix is to add
> a "valid" flag to the cache entries, which we can leave clear until we
> have finished parsing the new format string.  That avoids adding extra
> data copying as you suggested, removes the need for PG_TRY, and just
> generally seems cleaner and more bulletproof.
>
> I've pushed a patch that does it that way.  The 0001 patch will need
> to be rebased over that (might just require removal of some hunks,
> not sure).
>
> I also pushed 0002-to-timestamp-validation-v2.patch with some revisions
> (it'd broken acceptance of BC dates, among other things, but I think
> I fixed everything).
>
> Since you told us earlier that you'd be on vacation through the end of
> September, I'm assuming that nothing more will happen on this patch during
> this commitfest, so I will mark the CF entry Returned With Feedback.

Behalf of Artur I've rebased patch, removed hunk dealing with broken
cache entries by copying it, which is no longer required after 83bed06
commit.

Commitfest status left untouched, I am not sure how to deal with
"Returned With Feedback" patch. Is there any chance that, this can be
considered again for committer review?

Thanks !

Regards,
Amul


0001-to-timestamp-format-checking-v5.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-09-28 Thread Tom Lane
Artur Zakirov  writes:
> - now DCH_cache_getnew() is called after parse_format(). Because now 
> parse_format() can raise an error and in the next attempt 
> DCH_cache_search() could return broken cache entry.

I started looking at your 0001-to-timestamp-format-checking-v4.patch
and this point immediately jumped out at me.  Currently the code relies
... without any documentation ... on no elog being thrown out of
parse_format().  That's at the very least trouble waiting to happen.
There's a hack to deal with errors from within the NUMDesc_prepare
subroutine, but it's a pretty ugly and underdocumented hack.  And what
you had here was randomly different from that solution, too.

After a bit of thought it seemed to me that a much cleaner fix is to add
a "valid" flag to the cache entries, which we can leave clear until we
have finished parsing the new format string.  That avoids adding extra
data copying as you suggested, removes the need for PG_TRY, and just
generally seems cleaner and more bulletproof.

I've pushed a patch that does it that way.  The 0001 patch will need
to be rebased over that (might just require removal of some hunks,
not sure).

I also pushed 0002-to-timestamp-validation-v2.patch with some revisions
(it'd broken acceptance of BC dates, among other things, but I think
I fixed everything).

Since you told us earlier that you'd be on vacation through the end of
September, I'm assuming that nothing more will happen on this patch during
this commitfest, so I will mark the CF entry Returned With Feedback.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-09-28 Thread amul sul
On Fri, Sep 16, 2016 at 10:01 PM, Artur Zakirov
 wrote:
> On 25.08.2016 13:26, amul sul wrote:
>>>
>>> Thanks. I've created the entry in
>>> https://commitfest.postgresql.org/10/713/
>>> . You can add yourself as a reviewer.
>>>
>>
>> Done, added myself as reviewer & changed status to "Ready for
>> Committer". Thanks !
>>
>> Regards,
>> Amul Sul
>>
>>
>
> It seems there is no need to rebase patches. But I will not be able to fix
> patches for two weeks because of travel if someone will have a note or
> remark.
>
> So it would be nice if someone will be able to fix possible issues for above
> reasone.

Sure, Ill do that, if required.

Regards,
Amul


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-09-16 Thread Artur Zakirov

On 25.08.2016 13:26, amul sul wrote:

Thanks. I've created the entry in https://commitfest.postgresql.org/10/713/
. You can add yourself as a reviewer.



Done, added myself as reviewer & changed status to "Ready for
Committer". Thanks !

Regards,
Amul Sul




It seems there is no need to rebase patches. But I will not be able to 
fix patches for two weeks because of travel if someone will have a note 
or remark.


So it would be nice if someone will be able to fix possible issues for 
above reasone.


--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-08-25 Thread amul sul
On Thu, Aug 25, 2016 at 3:41 PM, Artur Zakirov  wrote:
>>> You are right. I assigned to prev_type NODE_TYPE_SPACE to be able to
>>> execute such query:
>>>
>>>
>>> SELECT to_timestamp('---2000JUN', ' MON');
>>>
>>>
>>> Will be it a proper behaviour?
>>
>>
>>
>> Looks good to me, no one will complain if something working on PG but not
>> on Oracle.
>
>
> Thanks. I've created the entry in https://commitfest.postgresql.org/10/713/
> . You can add yourself as a reviewer.
>

Done, added myself as reviewer & changed status to "Ready for
Committer". Thanks !

Regards,
Amul Sul


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-08-25 Thread Artur Zakirov

You are right. I assigned to prev_type NODE_TYPE_SPACE to be able to
execute such query:


SELECT to_timestamp('---2000JUN', ' MON');


Will be it a proper behaviour?



Looks good to me, no one will complain if something working on PG but not on 
Oracle.


Thanks. I've created the entry in 
https://commitfest.postgresql.org/10/713/ . You can add yourself as a 
reviewer.


--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-08-25 Thread amul sul
On Thursday, August 25, 2016 1:56 PM, Artur Zakirov  
wrote:
>> #2. Warning at compilation;
>>
>> formatting.c: In function ‘do_to_timestamp’:
>> formatting.c:3049:37: warning: ‘prev_type’ may be used uninitialized in this 
>> function [-Wmaybe-uninitialized]
>> if (prev_type == NODE_TYPE_SPACE || prev_type == NODE_TYPE_SEPARATOR)
>> ^
>> formatting.c:2988:5: note: ‘prev_type’ was declared here
>> prev_type;
>> ^
>>
>> You can avoid this by assigning  zero (or introduce NODE_TYPE_INVAL ) to 
>> prev_type at following line:
>>
>> 256 +   prev_type;
>
>
>You are right. I assigned to prev_type NODE_TYPE_SPACE to be able to 
>execute such query:
>
>
>SELECT to_timestamp('---2000JUN', ' MON');
>
>
>Will be it a proper behaviour?


Looks good to me, no one will complain if something working on PG but not on 
Oracle. 


Thanks & Regards,
Amul Sul


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-08-25 Thread Artur Zakirov

Hi,


#1.
Whitespace @ line # 317.


Sorry, fixed.


#2. Warning at compilation;

formatting.c: In function ‘do_to_timestamp’:
formatting.c:3049:37: warning: ‘prev_type’ may be used uninitialized in this 
function [-Wmaybe-uninitialized]
if (prev_type == NODE_TYPE_SPACE || prev_type == NODE_TYPE_SEPARATOR)
^
formatting.c:2988:5: note: ‘prev_type’ was declared here
prev_type;
^

You can avoid this by assigning  zero (or introduce NODE_TYPE_INVAL ) to 
prev_type at following line:

256 +   prev_type;


You are right. I assigned to prev_type NODE_TYPE_SPACE to be able to 
execute such query:


SELECT to_timestamp('---2000JUN', ' MON');

Will be it a proper behaviour?

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 5c1c4f6..36d8b3e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -6146,9 +6146,12 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
  
   
to_timestamp and to_date
-   skip multiple blank spaces in the input string unless the
+   skip multiple blank spaces and printable non letter and non digit
+   characters in the input string and in the formatting string unless the
FX option is used. For example,
-   to_timestamp('2000JUN', ' MON') works, but
+   to_timestamp('2000JUN', ' MON'),
+   to_timestamp('2000JUN', ' MON')
+   and to_timestamp('2000JUN', 'MON') work, but
to_timestamp('2000JUN', 'FX MON') returns an error
because to_timestamp expects one space only.
FX must be specified as the first item in
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index bbd97dc..7430013 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -169,6 +169,8 @@ struct FormatNode
 #define NODE_TYPE_END		1
 #define NODE_TYPE_ACTION	2
 #define NODE_TYPE_CHAR		3
+#define NODE_TYPE_SEPARATOR	4
+#define NODE_TYPE_SPACE		5
 
 #define SUFFTYPE_PREFIX		1
 #define SUFFTYPE_POSTFIX	2
@@ -947,6 +949,7 @@ typedef struct NUMProc
 static const KeyWord *index_seq_search(char *str, const KeyWord *kw,
  const int *index);
 static const KeySuffix *suff_search(char *str, const KeySuffix *suf, int type);
+static bool is_char_separator(char *str);
 static void NUMDesc_prepare(NUMDesc *num, FormatNode *n);
 static void parse_format(FormatNode *node, char *str, const KeyWord *kw,
 			 const KeySuffix *suf, const int *index, int ver, NUMDesc *Num);
@@ -963,7 +966,6 @@ static void dump_node(FormatNode *node, int max);
 static const char *get_th(char *num, int type);
 static char *str_numth(char *dest, char *num, int type);
 static int	adjust_partial_year_to_2020(int year);
-static int	strspace_len(char *str);
 static void from_char_set_mode(TmFromChar *tmfc, const FromCharDateMode mode);
 static void from_char_set_int(int *dest, const int value, const FormatNode *node);
 static int	from_char_parse_int_len(int *dest, char **src, const int len, FormatNode *node);
@@ -1036,6 +1038,17 @@ suff_search(char *str, const KeySuffix *suf, int type)
 	return NULL;
 }
 
+static bool
+is_char_separator(char *str)
+{
+	return ((pg_mblen(str) == 1) &&
+			/* printable character, but not letter and digit */
+			((*str >= '!' && *str <= '/') ||
+			 (*str >= ':' && *str <= '@') ||
+			 (*str >= '[' && *str <= '`') ||
+			 (*str >= '{' && *str <= '~')));
+}
+
 /* --
  * Prepare NUMDesc (number description struct) via FormatNode struct
  * --
@@ -1237,9 +1250,10 @@ parse_format(FormatNode *node, char *str, const KeyWord *kw,
 {
 	const KeySuffix *s;
 	FormatNode *n;
+	bool		in_text = false,
+in_backslash = false;
 	int			node_set = 0,
-suffix,
-last = 0;
+suffix;
 
 #ifdef DEBUG_TO_FROM_CHAR
 	elog(DEBUG_elog_output, "to_char/number(): run parser");
@@ -1251,6 +1265,55 @@ parse_format(FormatNode *node, char *str, const KeyWord *kw,
 	{
 		suffix = 0;
 
+		/* Previous character was a backslash */
+		if (in_backslash)
+		{
+			/* After backslash should go non-space character */
+			if (isspace(*str))
+ereport(ERROR,
+		(errcode(ERRCODE_SYNTAX_ERROR),
+		 errmsg("invalid escape sequence")));
+			in_backslash = false;
+
+			n->type = NODE_TYPE_CHAR;
+			n->character = *str;
+			n->key = NULL;
+			n->suffix = 0;
+			n++;
+			str++;
+			continue;
+		}
+		/* Previous character was a quote */
+		else if (in_text)
+		{
+			if (*str == '"')
+			{
+str++;
+in_text = false;
+			}
+			else if (*str == '\\')
+			{
+str++;
+in_backslash = true;
+			}
+			else
+			{
+if (ver == DCH_TYPE && is_char_separator(str))
+	n->type = NODE_TYPE_SEPARATOR;
+else if (isspace(*str))
+	n->type = NODE_TYPE_SPACE;
+else
+	n->type = NODE_TYPE_CHAR;
+
+n->character = *str;
+n->key = NULL;
+n->suffix = 0;
+n++;
+str++;
+			}
+			continue;
+		}
+
 		/*
 		 

Re: [HACKERS] Bug in to_timestamp().

2016-08-24 Thread amul sul
Hi Artur Zakirov,

0001-to-timestamp-format-checking-v3.patch looks pretty reasonable to me, other 
than following concern:

#1.
Whitespace @ line # 317.

#2. Warning at compilation;

formatting.c: In function ‘do_to_timestamp’:
formatting.c:3049:37: warning: ‘prev_type’ may be used uninitialized in this 
function [-Wmaybe-uninitialized]
if (prev_type == NODE_TYPE_SPACE || prev_type == NODE_TYPE_SEPARATOR)
^
formatting.c:2988:5: note: ‘prev_type’ was declared here
prev_type;
^

You can avoid this by assigning  zero (or introduce NODE_TYPE_INVAL ) to 
prev_type at following line: 

256 +   prev_type;



Regards,
Amul Sul


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-08-24 Thread Artur Zakirov
Sorry. I did not get last two mails from Amul. Don't know why. So I 
reply to another mail.



Documented as working case, but unfortunatly it does not :

postgres=# SELECT to_timestamp('2000JUN', ' MON');
ERROR:  invalid value "---" for "MON"
DETAIL:  The given value did not match any of the allowed values for this field.


Indeed! Fixed it. Now this query executes without error. Added this case 
to tests.



NODE_TYPE_CHAR assumption in else block seems incorrect. What if we have space 
after double quote?  It will again have incorrect output without any error, see 
below:

postgres=# SELECT to_timestamp('Year: 1976, Month: May, Day: 16',
postgres(# '"Year:" , "Month:" FMMonth, "Day:"   DD');
to_timestamp
--
0006-05-16 00:00:00-07:52:58
(1 row)

I guess, we might need NODE_TYPE_SEPARATOR and NODE_TYPE_SPACE check as well?


Agree. Fixed and added to tests.


Unnecessary hunk?
We should not touch any code unless it necessary to implement proposed feature, 
otherwise it will add unnecessary diff to the patch and eventually extra burden 
to review the code. Similarly hunk in the patch @ line # 313 - 410, nothing to 
do with to_timestamp behaviour improvement, IIUC.

If you think this changes need to be in, please submit separate cleanup-patch.


Fixed it. It was a typo.
About lines # 313 - 410. It is necessary to avoid bugs. I wrote aboud it 
in 
https://www.postgresql.org/message-id/b2a39359-3282-b402-f4a3-057aae500ee7%40postgrespro.ru 
:



- now DCH_cache_getnew() is called after parse_format(). Because now
parse_format() can raise an error and in the next attempt
DCH_cache_search() could return broken cache entry.


For example, you can test incorrect inputs for to_timestamp(). Try to 
execute such query several times.


--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 6355300..0fe50e1 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -6146,9 +6146,12 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
  
   
to_timestamp and to_date
-   skip multiple blank spaces in the input string unless the
+   skip multiple blank spaces and printable non letter and non digit
+   characters in the input string and in the formatting string unless the
FX option is used. For example,
-   to_timestamp('2000JUN', ' MON') works, but
+   to_timestamp('2000JUN', ' MON'),
+   to_timestamp('2000JUN', ' MON')
+   and to_timestamp('2000JUN', 'MON') work, but
to_timestamp('2000JUN', 'FX MON') returns an error
because to_timestamp expects one space only.
FX must be specified as the first item in
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index bbd97dc..a3dbcaf 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -169,6 +169,8 @@ struct FormatNode
 #define NODE_TYPE_END		1
 #define NODE_TYPE_ACTION	2
 #define NODE_TYPE_CHAR		3
+#define NODE_TYPE_SEPARATOR	4
+#define NODE_TYPE_SPACE		5
 
 #define SUFFTYPE_PREFIX		1
 #define SUFFTYPE_POSTFIX	2
@@ -947,6 +949,7 @@ typedef struct NUMProc
 static const KeyWord *index_seq_search(char *str, const KeyWord *kw,
  const int *index);
 static const KeySuffix *suff_search(char *str, const KeySuffix *suf, int type);
+static bool is_char_separator(char *str);
 static void NUMDesc_prepare(NUMDesc *num, FormatNode *n);
 static void parse_format(FormatNode *node, char *str, const KeyWord *kw,
 			 const KeySuffix *suf, const int *index, int ver, NUMDesc *Num);
@@ -963,7 +966,6 @@ static void dump_node(FormatNode *node, int max);
 static const char *get_th(char *num, int type);
 static char *str_numth(char *dest, char *num, int type);
 static int	adjust_partial_year_to_2020(int year);
-static int	strspace_len(char *str);
 static void from_char_set_mode(TmFromChar *tmfc, const FromCharDateMode mode);
 static void from_char_set_int(int *dest, const int value, const FormatNode *node);
 static int	from_char_parse_int_len(int *dest, char **src, const int len, FormatNode *node);
@@ -1036,6 +1038,17 @@ suff_search(char *str, const KeySuffix *suf, int type)
 	return NULL;
 }
 
+static bool
+is_char_separator(char *str)
+{
+	return ((pg_mblen(str) == 1) &&
+			/* printable character, but not letter and digit */
+			((*str >= '!' && *str <= '/') ||
+			 (*str >= ':' && *str <= '@') ||
+			 (*str >= '[' && *str <= '`') ||
+			 (*str >= '{' && *str <= '~')));
+}
+
 /* --
  * Prepare NUMDesc (number description struct) via FormatNode struct
  * --
@@ -1237,9 +1250,10 @@ parse_format(FormatNode *node, char *str, const KeyWord *kw,
 {
 	const KeySuffix *s;
 	FormatNode *n;
+	bool		in_text = false,
+in_backslash = false;
 	int			node_set = 0,
-suffix,
-last = 0;
+suffix;
 
 #ifdef DEBUG_TO_FROM_CHAR
 	

Re: [HACKERS] Bug in to_timestamp().

2016-08-22 Thread amul sul
Hi Artur Zakirov,

Please see following review comment for 
"0001-to-timestamp-format-checking-v2.patch" & share your thought: 

#1.

15 +   to_timestamp('2000JUN', ' MON')

Documented as working case, but unfortunatly it does not : 

postgres=# SELECT to_timestamp('2000JUN', ' MON');
ERROR:  invalid value "---" for "MON"
DETAIL:  The given value did not match any of the allowed values for this field.


#2.

102 +   /* Previous character was a quote */
103 +   else if (in_text)
104 +   {
105 +   if (*str == '"')
106 +   {
107 +   str++;
108 +   in_text = false;
109 +   }
110 +   else if (*str == '\\')
111 +   {
112 +   str++;
113 +   in_backslash = true;
114 +   }
115 +   else
116 +   {
117 +   n->type = NODE_TYPE_CHAR;
118 +   n->character = *str;
119 +   n->key = NULL;
120 +   n->suffix = 0;
121 +   n++;
122 +   str++;
123 +   }
124 +   continue;
125 +   }
126 +

NODE_TYPE_CHAR assumption in else block seems incorrect. What if we have space 
after double quote?  It will again have incorrect output without any error, see 
below: 

postgres=# SELECT to_timestamp('Year: 1976, Month: May, Day: 16', 
postgres(# '"Year:" , "Month:" FMMonth, "Day:"   DD');
to_timestamp 
--
0006-05-16 00:00:00-07:52:58
(1 row)

I guess, we might need NODE_TYPE_SEPARATOR and NODE_TYPE_SPACE check as well? 



#3.

296 -   /* Ignore spaces before fields when not in FX (fixed width) mode */
297 +   /* Ignore spaces before fields when not in FX (fixed * width) mode 
*/
298 if (!fx_mode && n->key->id != DCH_FX)
299 {
300 -   while (*s != '\0' && isspace((unsigned char) *s))
301 +   while (*s != '\0' && (isspace((unsigned char) *s)))
302 s++;

Unnecessary hunk?
We should not touch any code unless it necessary to implement proposed feature, 
otherwise it will add unnecessary diff to the patch and eventually extra burden 
to review the code. Similarly hunk in the patch @ line # 313 - 410, nothing to 
do with to_timestamp behaviour improvement, IIUC.

If you think this changes need to be in, please submit separate cleanup-patch.


>I attached second patch "0002-to-timestamp-validation-v2.patch". With it 
>PostgreSQL perform additional checks for date and time. But as I wrote 
>there is another patch in the thread "to_date_valid()" wich differs from 
>this patch.

@community : I am not sure what to do with this patch, should we keep it as 
separate enhancement?

Regards,
Amul Sul


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-08-18 Thread amul sul

On Friday, August 19, 2016 12:42 AM, Robert Haas  wrote:
>On Wed, Aug 17, 2016 at 10:35 AM, amul sul  wrote:
>
>
>> Hmm. I haven't really looked into the code, but with applying both patches 
>> it looks precisely imitate Oracle's behaviour. Thanks.
>
>
>This is good to hear, but for us to consider applying something like
>this, somebody would need to look into the code pretty deeply.

Sure Robert, Im planning to start initial review until next week at the 
earliest. Thanks


Regards,
Amul Sul


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-08-18 Thread Robert Haas
On Wed, Aug 17, 2016 at 10:35 AM, amul sul  wrote:
> Hmm. I haven't really looked into the code, but with applying both patches it 
> looks precisely imitate Oracle's behaviour. Thanks.

This is good to hear, but for us to consider applying something like
this, somebody would need to look into the code pretty deeply.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-08-17 Thread amul sul
On Wednesday, August 17, 2016 5:15 PM, Artur Zakirov  
wrote:
>I attached new patch "0001-to-timestamp-format-checking-v2.patch". It 
>fixes behaviour for Amul's scenarious:

>
Great.
>
>> Following are few scenarios where we break existing behaviour:
>>
>> SELECT TO_TIMESTAMP('2015-12-31 13:43:36', ' MM DD HH24 MI SS');
>> SELECT TO_TIMESTAMP('2011$03!18 23_38_15', '-MM-DD HH24:MI:SS');
>> SELECT TO_TIMESTAMP('2011*03*18 23^38&15', '-MM-DD HH24:MI:SS');
>> SELECT TO_TIMESTAMP('2011*03!18 #%23^38$15', '-MM-DD$$$HH24:MI:SS');
>>
>> But current patch behaviour is not that much bad either at least we have 
>> errors, but I am not sure about community acceptance.
>>
>> I would like to divert communities' attention on following case:
>> SELECT TO_TIMESTAMP('2013--10-01', '-MM-DD');
>
>
>For queries above the patch gives an output without any error.
>
>
>> Another is, shouldn’t we have error in following cases?
>> SELECT TO_TIMESTAMP('2016-06-13 99:99:99', '-MM-DD HH24:MI:SS');
>> SELECT TO_TIMESTAMP('2016-02-30 15:43:36', '-MM-DD HH24:MI:SS');
>
>
>I attached second patch "0002-to-timestamp-validation-v2.patch". With it 
>PostgreSQL perform additional checks for date and time. But as I wrote 
>there is another patch in the thread "to_date_valid()" wich differs from 
>this patch.

>

Hmm. I haven't really looked into the code, but with applying both patches it 
looks precisely imitate Oracle's behaviour. Thanks.

Regards,
Amul Sul


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-08-17 Thread Artur Zakirov
I attached new patch "0001-to-timestamp-format-checking-v2.patch". It 
fixes behaviour for Amul's scenarious:



Following are few scenarios where we break existing behaviour:

SELECT TO_TIMESTAMP('2015-12-31 13:43:36', ' MM DD HH24 MI SS');
SELECT TO_TIMESTAMP('2011$03!18 23_38_15', '-MM-DD HH24:MI:SS');
SELECT TO_TIMESTAMP('2011*03*18 23^38&15', '-MM-DD HH24:MI:SS');
SELECT TO_TIMESTAMP('2011*03!18 #%23^38$15', '-MM-DD$$$HH24:MI:SS');

But current patch behaviour is not that much bad either at least we have 
errors, but I am not sure about community acceptance.

I would like to divert communities' attention on following case:
SELECT TO_TIMESTAMP('2013--10-01', '-MM-DD');


For queries above the patch gives an output without any error.


Another is, shouldn’t we have error in following cases?
SELECT TO_TIMESTAMP('2016-06-13 99:99:99', '-MM-DD HH24:MI:SS');
SELECT TO_TIMESTAMP('2016-02-30 15:43:36', '-MM-DD HH24:MI:SS');


I attached second patch "0002-to-timestamp-validation-v2.patch". With it 
PostgreSQL perform additional checks for date and time. But as I wrote 
there is another patch in the thread "to_date_valid()" wich differs from 
this patch.


Sincerely,
--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7830334..559c55f 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -6083,9 +6083,12 @@ SELECT regexp_matches('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
  
   
to_timestamp and to_date
-   skip multiple blank spaces in the input string unless the
+   skip multiple blank spaces and printable non letter and non digit
+   characters in the input string and in the formatting string unless the
FX option is used. For example,
-   to_timestamp('2000JUN', ' MON') works, but
+   to_timestamp('2000JUN', ' MON'),
+   to_timestamp('2000JUN', ' MON')
+   and to_timestamp('2000JUN', 'MON') work, but
to_timestamp('2000JUN', 'FX MON') returns an error
because to_timestamp expects one space only.
FX must be specified as the first item in
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index bbd97dc..b14678d 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -169,6 +169,8 @@ struct FormatNode
 #define NODE_TYPE_END		1
 #define NODE_TYPE_ACTION	2
 #define NODE_TYPE_CHAR		3
+#define NODE_TYPE_SEPARATOR	4
+#define NODE_TYPE_SPACE		5
 
 #define SUFFTYPE_PREFIX		1
 #define SUFFTYPE_POSTFIX	2
@@ -947,6 +949,7 @@ typedef struct NUMProc
 static const KeyWord *index_seq_search(char *str, const KeyWord *kw,
  const int *index);
 static const KeySuffix *suff_search(char *str, const KeySuffix *suf, int type);
+static bool is_char_separator(char *str);
 static void NUMDesc_prepare(NUMDesc *num, FormatNode *n);
 static void parse_format(FormatNode *node, char *str, const KeyWord *kw,
 			 const KeySuffix *suf, const int *index, int ver, NUMDesc *Num);
@@ -963,7 +966,6 @@ static void dump_node(FormatNode *node, int max);
 static const char *get_th(char *num, int type);
 static char *str_numth(char *dest, char *num, int type);
 static int	adjust_partial_year_to_2020(int year);
-static int	strspace_len(char *str);
 static void from_char_set_mode(TmFromChar *tmfc, const FromCharDateMode mode);
 static void from_char_set_int(int *dest, const int value, const FormatNode *node);
 static int	from_char_parse_int_len(int *dest, char **src, const int len, FormatNode *node);
@@ -1036,6 +1038,17 @@ suff_search(char *str, const KeySuffix *suf, int type)
 	return NULL;
 }
 
+static bool
+is_char_separator(char *str)
+{
+	return ((pg_mblen(str) == 1) &&
+			/* printable character, but not letter and digit */
+			((*str >= '!' && *str <= '/') ||
+			 (*str >= ':' && *str <= '@') ||
+			 (*str >= '[' && *str <= '`') ||
+			 (*str >= '{' && *str <= '~')));
+}
+
 /* --
  * Prepare NUMDesc (number description struct) via FormatNode struct
  * --
@@ -1237,9 +1250,10 @@ parse_format(FormatNode *node, char *str, const KeyWord *kw,
 {
 	const KeySuffix *s;
 	FormatNode *n;
+	bool		in_text = false,
+in_backslash = false;
 	int			node_set = 0,
-suffix,
-last = 0;
+suffix;
 
 #ifdef DEBUG_TO_FROM_CHAR
 	elog(DEBUG_elog_output, "to_char/number(): run parser");
@@ -1251,6 +1265,49 @@ parse_format(FormatNode *node, char *str, const KeyWord *kw,
 	{
 		suffix = 0;
 
+		/* Previous character was a backslash */
+		if (in_backslash)
+		{
+			/* After backslash should go non-space character */
+			if (isspace(*str))
+ereport(ERROR,
+		(errcode(ERRCODE_SYNTAX_ERROR),
+		 errmsg("invalid escape sequence")));
+			in_backslash = false;
+
+			n->type = NODE_TYPE_CHAR;
+			n->character = *str;
+			n->key = NULL;
+			n->suffix = 0;
+			n++;
+			str++;
+			continue;
+		}
+		/* Previous character was 

Re: [HACKERS] Bug in to_timestamp().

2016-08-16 Thread Artur Zakirov

On 15.08.2016 19:28, Robert Haas wrote:


Well, what's the Oracle behavior in any of these cases?  I don't think
we can decide to change any of this behavior without knowing that.  If
a proposed behavior change is incompatible with our previous releases,
I think it'd better at least be more compatible with Oracle.
Otherwise, we're just changing from an established behavior that we
invented ourselves to a new behavior we invented ourselves, which is
only worthwhile if it's absolutely clear that the new behavior is way
better.



1 - Oracle's output for first queries is:

-> SELECT TO_TIMESTAMP('2015-12-31 13:43:36', ' MM DD HH24 MI SS') 
FROM dual;


TO_TIMESTAMP('2015-12-3113:43:36','MMDDHH24MISS')
---
31-DEC-15 01.43.36.0 PM

-> SELECT TO_TIMESTAMP('2011$03!18 23_38_15', '-MM-DD HH24:MI:SS') 
FROM dual;


TO_TIMESTAMP('2011$03!1823_38_15','-MM-DDHH24:MI:SS')
---
18-MAR-11 11.38.15.0 PM

-> SELECT TO_TIMESTAMP('2011*03!18 #%23^38$15', 
'-MM-DD$$$HH24:MI:SS') FROM dual;


TO_TIMESTAMP('2011*03!18#%23^38$15','-MM-DD$$$HH24:MI:SS')
---
18-MAR-11 11.38.15.0 PM

PostgreSQL with the patch gives "ERROR:  expected space character in 
given string". I will fix this.



2 - Oracle's output for query with hyphen is:

-> SELECT TO_TIMESTAMP('2013--10-01', '-MM-DD') FROM dual;
SELECT TO_TIMESTAMP('2013--10-01', '-MM-DD') FROM dual
*
ERROR at line 1:
ORA-01843: not a valid month

Here PostgreSQL with the patch does not give an error. So I will fix 
this too.



3 - The last two queries give an error. This patch do not handle such 
queries intentionally, because there is the thread 
https://www.postgresql.org/message-id/57786490.9010201%40wars-nicht.de . 
That thread concerns to_date() function. But it should concerns 
to_timestamp() also. So I suppose that should be a different patch for 
this last case.


--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-08-15 Thread amul sul
Monday, 15 August 2016 9:58 PM, Robert Haas  wrote:

>On Mon, Aug 15, 2016 at 10:56 AM, amul sul  wrote:
>> On Thursday, 11 August 2016 3:18 PM, Artur Zakirov
>>  wrote:


[Skipped..]
>Well, what's the Oracle behavior in any of these cases?  I don't think
>we can decide to change any of this behavior without knowing that.  If
>a proposed behavior change is incompatible with our previous releases,
>I think it'd better at least be more compatible with Oracle.
>Otherwise, we're just changing from an established behavior that we
>invented ourselves to a new behavior we invented ourselves, which is
>only worthwhile if it's absolutely clear that the new behavior is way
>better.

Following cases works as expected on Oracle (except 3rd one asking input value 
for &15).

>> SELECT TO_TIMESTAMP('2015-12-31 13:43:36', ' MM DD HH24 MI SS');
>> SELECT TO_TIMESTAMP('2011$03!18 23_38_15', '-MM-DD HH24:MI:SS');
>> SELECT TO_TIMESTAMP('2011*03*18 23^38&15', '-MM-DD HH24:MI:SS');
>> SELECT TO_TIMESTAMP('2011*03!18 #%23^38$15', '-MM-DD$$$HH24:MI:SS');


And rest throwing error as shown below:

>> SELECT TO_TIMESTAMP('2016-06-13 99:99:99', '-MM-DD HH24:MI:SS');
ERROR at line 1:
ORA-01850: hour must be between 0 and 23

>> SELECT TO_TIMESTAMP('2016-02-30 15:43:36', '-MM-DD HH24:MI:SS');
ERROR at line 1:
ORA-01839: date not valid for month specified


>(Also, note that text formatted email is generally preferred to HTML
>on this mailing list; the fact that your email is in a different font
>than the rest of the thread makes it hard to read.)

Understood. Will try to follow this, thanks.


Regards,
Amul Sul


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-08-15 Thread Robert Haas
On Mon, Aug 15, 2016 at 10:56 AM, amul sul  wrote:
> On Thursday, 11 August 2016 3:18 PM, Artur Zakirov
>  wrote:
>
>>Here is my patch. It is a proof of concept.
>>Date/Time Formatting
>>
>>There are changes in date/time formatting rules:
> -> now to_timestamp() and to_date() skip spaces in the input string and
>>in the formatting string unless FX option is used, as Amul Sul wrote on
>>first message of this thread. But Ex.2 gives an error now with this
>>patch (should we fix this too?).
>
> Why not, currently we are skipping whitespace exists at the start of input
> string but not if in format string.
>
> [Skipped… ]
>
>>Of course this patch can be completely wrong. But it tries to introduce
>>more formal rules for formatting.
>>I will be grateful for notes and remarks.
>
> Following are few scenarios where we break existing behaviour:
>
> SELECT TO_TIMESTAMP('2015-12-31 13:43:36', ' MM DD HH24 MI SS');
> SELECT TO_TIMESTAMP('2011$03!18 23_38_15', '-MM-DD HH24:MI:SS');
> SELECT TO_TIMESTAMP('2011*03*18 23^38&15', '-MM-DD HH24:MI:SS');
> SELECT TO_TIMESTAMP('2011*03!18 #%23^38$15', '-MM-DD$$$HH24:MI:SS');
>
> But current patch behaviour is not that much bad either at least we have
> errors, but I am not sure about community acceptance.
>
> I would like to divert communities' attention on following case:
> SELECT TO_TIMESTAMP('2013--10-01', '-MM-DD');
>
> Where the hyphen (-) is not skipped. So ultimately -10 is interpreted using
> MM as negative 10. So the date goes back by that many months (and probably
> additional days because of -31), and so the final output becomes 2012-01-30.
> But the fix is not specific to hyphen case. Ideally the fix would have been
> to handle it in from_char_parse_int(). Here, -10 is converted to int using
> strtol. May be we could have done it using strtoul(). Is there any intention
> behind not considering signed integers versus unsigned ones ?
>
> Another is, shouldn’t we have error in following cases?
> SELECT TO_TIMESTAMP('2016-06-13 99:99:99', '-MM-DD HH24:MI:SS');
> SELECT TO_TIMESTAMP('2016-02-30 15:43:36', '-MM-DD HH24:MI:SS');

Well, what's the Oracle behavior in any of these cases?  I don't think
we can decide to change any of this behavior without knowing that.  If
a proposed behavior change is incompatible with our previous releases,
I think it'd better at least be more compatible with Oracle.
Otherwise, we're just changing from an established behavior that we
invented ourselves to a new behavior we invented ourselves, which is
only worthwhile if it's absolutely clear that the new behavior is way
better.

(Also, note that text formatted email is generally preferred to HTML
on this mailing list; the fact that your email is in a different font
than the rest of the thread makes it hard to read.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-08-15 Thread amul sul
On Thursday, 11 August 2016 3:18 PM, Artur Zakirov  
wrote:
>Here is my patch. It is a proof of concept.>Date/Time 
>Formatting>>There are changes in date/time formatting 
>rules:-> now to_timestamp() and to_date() skip spaces in the input string and 
>>in the formatting string unless FX option is used, as Amul Sul wrote on 
>>first message of this thread. But Ex.2 gives an error now with this >patch 
>(should we fix this too?).
Why not, currently we are skipping whitespace exists at the start of input 
string but not if in format string.
[Skipped… ]
>Of course this patch can be completely wrong. But it tries to introduce >more 
>formal rules for formatting.>I will be grateful for notes and remarks.
Following are few scenarios where we break existing behaviour:
SELECT TO_TIMESTAMP('2015-12-31 13:43:36', ' MM DD HH24 MI SS');SELECT 
TO_TIMESTAMP('2011$03!18 23_38_15', '-MM-DD HH24:MI:SS');SELECT 
TO_TIMESTAMP('2011*03*18 23^38&15', '-MM-DD HH24:MI:SS');SELECT 
TO_TIMESTAMP('2011*03!18 #%23^38$15', '-MM-DD$$$HH24:MI:SS');
But current patch behaviour is not that much bad either at least we have 
errors, but I am not sure about community acceptance.
I would like to divert communities' attention on following case:SELECT 
TO_TIMESTAMP('2013--10-01', '-MM-DD');
Where the hyphen (-) is not skipped. So ultimately -10 is interpreted using MM 
as negative 10. So the date goes back by that many months (and probably 
additional days because of -31), and so the final output becomes 2012-01-30. 
But the fix is not specific to hyphen case. Ideally the fix would have been to 
handle it in from_char_parse_int(). Here, -10 is converted to int using strtol. 
May be we could have done it using strtoul(). Is there any intention behind not 
considering signed integers versus unsigned ones ?
Another is, shouldn’t we have error in following cases? SELECT 
TO_TIMESTAMP('2016-06-13 99:99:99', '-MM-DD HH24:MI:SS'); SELECT 
TO_TIMESTAMP('2016-02-30 15:43:36', '-MM-DD HH24:MI:SS');

Thanks  & Regards,Amul Sul


Re: [HACKERS] Bug in to_timestamp().

2016-08-11 Thread Artur Zakirov

Hello,

On 14.07.2016 12:16, Pavel Stehule wrote:


last point was discussed in thread related to to_date_valid function.

Regards

Pavel


Thank you.

Here is my patch. It is a proof of concept.

Date/Time Formatting


There are changes in date/time formatting rules:

- now to_timestamp() and to_date() skip spaces in the input string and 
in the formatting string unless FX option is used, as Amul Sul wrote on 
first message of this thread. But Ex.2 gives an error now with this 
patch (should we fix this too?).


- in the code space characters and separator characters have different 
types of FormatNode. Separator characters are characters ',', '-', '.', 
'/' and ':'. This is done to have different rules of formatting to space 
and separator characters.
If FX option isn't used then PostgreSQL do not insist that separator in 
the formatting string should match separator in the formatting string. 
But count of separators should be equal with or without FX option.


- now PostgreSQL check is there a closing quote. Otherwise the error is 
raised.


Still PostgreSQL do not insist that text character in the formatting 
string should match text character in the input string. It is not 
obvious if this should be fixed. Because we may have different character 
case or character with accent mark or without accent mark.
But I suppose that it is not right just check text character count. For 
example, there is unicode version of space character U+00A0.


Code changes


- new defines:

#define NODE_TYPE_SEPARATOR 4
#define NODE_TYPE_SPACE 5

- now DCH_cache_getnew() is called after parse_format(). Because now 
parse_format() can raise an error and in the next attempt 
DCH_cache_search() could return broken cache entry.



This patch do not handle all noticed issues in this thread, since still 
there is not consensus about them. So this patch in a proof of concept 
status and it can be changed.


Of course this patch can be completely wrong. But it tries to introduce 
more formal rules for formatting.


I will be grateful for notes and remarks.

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7830334..5656245 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -6083,9 +6083,10 @@ SELECT regexp_matches('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
  
   
to_timestamp and to_date
-   skip multiple blank spaces in the input string unless the
-   FX option is used. For example,
-   to_timestamp('2000JUN', ' MON') works, but
+   skip multiple blank spaces in the input string and in the formatting
+   string unless the FX option is used. For example,
+   to_timestamp('2000JUN', ' MON')
+   and to_timestamp('2000JUN', 'MON') work, but
to_timestamp('2000JUN', 'FX MON') returns an error
because to_timestamp expects one space only.
FX must be specified as the first item in
@@ -6121,6 +6122,19 @@ SELECT regexp_matches('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
skips two input characters.
   
  
+ 
+ 
+  
+   In to_date and to_timestamp separator
+   characters ',', '-',
+   '.', '/' and ':'
+   outside of double-quoted strings skip the number of input characters
+   contained in the string unless the FX option is used.
+   If FX option is specified then consumed separator
+   character in the formatting string must match the separator character
+   in the input string.
+  
+ 
 
  
   
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index bbd97dc..ef39df9 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -169,6 +169,8 @@ struct FormatNode
 #define NODE_TYPE_END		1
 #define NODE_TYPE_ACTION	2
 #define NODE_TYPE_CHAR		3
+#define NODE_TYPE_SEPARATOR	4
+#define NODE_TYPE_SPACE		5
 
 #define SUFFTYPE_PREFIX		1
 #define SUFFTYPE_POSTFIX	2
@@ -947,6 +949,7 @@ typedef struct NUMProc
 static const KeyWord *index_seq_search(char *str, const KeyWord *kw,
  const int *index);
 static const KeySuffix *suff_search(char *str, const KeySuffix *suf, int type);
+static bool is_char_separator(char *str);
 static void NUMDesc_prepare(NUMDesc *num, FormatNode *n);
 static void parse_format(FormatNode *node, char *str, const KeyWord *kw,
 			 const KeySuffix *suf, const int *index, int ver, NUMDesc *Num);
@@ -963,7 +966,6 @@ static void dump_node(FormatNode *node, int max);
 static const char *get_th(char *num, int type);
 static char *str_numth(char *dest, char *num, int type);
 static int	adjust_partial_year_to_2020(int year);
-static int	strspace_len(char *str);
 static void from_char_set_mode(TmFromChar *tmfc, const FromCharDateMode mode);
 static void from_char_set_int(int *dest, const int value, const FormatNode *node);
 static int	

Re: [HACKERS] Bug in to_timestamp().

2016-07-14 Thread Pavel Stehule
2016-07-14 11:05 GMT+02:00 Artur Zakirov :

> On 23.06.2016 21:02, Tom Lane wrote:
>
>> Robert Haas  writes:
>>
>>> On Thu, Jun 23, 2016 at 1:40 PM, Tom Lane  wrote:
>>>
 At the very least I'd want to see a thought-through proposal that
 addresses all three of these interrelated points:

 * what should a space in the format match
 * what should a non-space, non-format-code character in the format match
 * how should we handle fields that are not exactly the width suggested
 by the format

>>>
>> I'm not averse to some further study of those issues, and I think the
>>> first two are closely related.  The third one strikes me as a somewhat
>>> separate consideration that doesn't need to be addressed by the same
>>> patch.
>>>
>>
>> If you think those issues are not interrelated, you have not thought
>> about it carefully enough.
>>
>> As an example, what we can do to handle not-expected-width fields is
>> very different if the format is "DDMMYY" versus if it is "DD-MM-YY".
>> In the first case we have little choice but to believe that each
>> field is exactly two digits wide.  In the second case, depending on
>> how we decide to define matching of "-", we might be able to allow
>> the field widths to vary so that they're effectively "whatever is
>> between the dashes".  But that would require insisting that "-"
>> match a "-", or at least a non-alphanumeric, which is not how it
>> behaves today.
>>
>> I don't want to twiddle these behaviors in 9.6 and then again next year.
>>
>> regards, tom lane
>>
>>
>>
> Hi,
>
> I want to start work on this patch.
>
> As a conclusion:
> - need a decision about three questions:
>
>
>> * what should a space in the format match
>> * what should a non-space, non-format-code character in the format match
>> * how should we handle fields that are not exactly the width suggested
>> by the format
>>
>
> - nobody wants solve this issue in 9.6.
>
> And I have question: what about wrong input in date argument? For example,
> from Alex's message:
>
> postgres=# SELECT TO_TIMESTAMP('2016-02-30 15:43:36', '-MM-DD
>> HH24:MI:SS');
>>to_timestamp
>> 
>>   2016-03-01 15:43:36+03
>> (1 row)
>>
>
> Here '2016-02-30' is wrong date. I didn't see any conclusion about this
> case in the thread.
>

last point was discussed in thread related to to_date_valid function.

Regards

Pavel


>
> --
> Artur Zakirov
> Postgres Professional: http://www.postgrespro.com
> Russian Postgres Company
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] Bug in to_timestamp().

2016-07-14 Thread Artur Zakirov

On 23.06.2016 21:02, Tom Lane wrote:

Robert Haas  writes:

On Thu, Jun 23, 2016 at 1:40 PM, Tom Lane  wrote:

At the very least I'd want to see a thought-through proposal that
addresses all three of these interrelated points:

* what should a space in the format match
* what should a non-space, non-format-code character in the format match
* how should we handle fields that are not exactly the width suggested
by the format



I'm not averse to some further study of those issues, and I think the
first two are closely related.  The third one strikes me as a somewhat
separate consideration that doesn't need to be addressed by the same
patch.


If you think those issues are not interrelated, you have not thought
about it carefully enough.

As an example, what we can do to handle not-expected-width fields is
very different if the format is "DDMMYY" versus if it is "DD-MM-YY".
In the first case we have little choice but to believe that each
field is exactly two digits wide.  In the second case, depending on
how we decide to define matching of "-", we might be able to allow
the field widths to vary so that they're effectively "whatever is
between the dashes".  But that would require insisting that "-"
match a "-", or at least a non-alphanumeric, which is not how it
behaves today.

I don't want to twiddle these behaviors in 9.6 and then again next year.

regards, tom lane




Hi,

I want to start work on this patch.

As a conclusion:
- need a decision about three questions:



* what should a space in the format match
* what should a non-space, non-format-code character in the format match
* how should we handle fields that are not exactly the width suggested
by the format


- nobody wants solve this issue in 9.6.

And I have question: what about wrong input in date argument? For 
example, from Alex's message:



postgres=# SELECT TO_TIMESTAMP('2016-02-30 15:43:36', '-MM-DD
HH24:MI:SS');
   to_timestamp

  2016-03-01 15:43:36+03
(1 row)


Here '2016-02-30' is wrong date. I didn't see any conclusion about this 
case in the thread.


--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-27 Thread Bruce Momjian
On Thu, Jun 23, 2016 at 02:00:49PM -0400, Robert Haas wrote:
> > Ok.  I'm having trouble seeing this justified as a bug fix - the docs
> > clearly state our behavior is intentional.  Improved behavior, yes, but
> > that's a feature and we're in beta2.  Please be specific if you believe I've
> > misinterpreted project policies on this matter.
> 
> I would not vote to back-patch a change in this area because I think
> that could break SQL code that works today.  But I think the current
> behavior is pretty well indefensible.  It's neither intuitive nor
> compatible with Oracle.  And there is plenty of existing precedent for
> making this sort of change during the beta period.  We regard it as a
> bug fix which is too dangerous to back-patch; therefore, it is neither
> subject to feature freeze nor does it go into existing stable
> releases.  Whether to treat this particular issue in that particular
> way is something that needs to be hammered out in discussion.  Tom
> raises the valid point that we need to make sure that we've thoroughly
> studied this issue and fixed all of the related cases before
> committing anything -- we don't want to change the behavior in
> 9.6beta, release 9.6, and then have to change it again for 9.7.  But
> there is certainly no project policy which says we can't change this
> now for 9.6 if we decide that (1) the current behavior is wrong and
> (2) we are quite sure we know how to fix it.

If you are not going to backpatch this for compatibility reasons, I
don't think changing it during beta makes sense either because you open
the chance of breaking applications that have already been tested on
earlier 9.6 betas.  This would only make sense if the  to_timestamp bugs
were new in 9.6.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-27 Thread Robert Haas
On Fri, Jun 24, 2016 at 5:16 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Fri, Jun 24, 2016 at 12:26 PM, Steve Crawford
>>  wrote:
>>> To me, 2016-02-30 is an invalid date that should generate an error.
>
>> I don't particularly disagree with that, but on the other hand, as
>> mentioned earlier, to_timestamp() is here for Oracle compatibility,
>> and if it doesn't do what Oracle's function does, then (1) it's not
>> useful for people migrating from Oracle and (2) we're making up the
>> behavior out of whole cloth.  I think things that we invent ourselves
>> should reject stuff like this, but in a compatibility function we
>> might want to, say, have compatibility.
>
> Agreed, mostly, but ... how far are we prepared to go on that?  The one
> thing I know about that is different from Oracle and is not something that
> most people would consider clearly wrong is the behavior of the FM prefix.
> We think it's a prefix that modifies only the next format code; they think
> it's a toggle.  If we make that act like Oracle, we will silently break an
> awful lot of applications, and there will be *no way* to write code that
> is correct under both interpretations.  (And no, I do not want to hear
> "let's fix it with a GUC".)  So I'm afraid we're between a rock and a hard
> place on that one --- but if we let that stand, the argument that Oracle's
> to_timestamp should be treated as right by definition loses a lot of air.

Well, I think that you're making several logical jumps that I
personally would decline to make.  First, I don't think every issue
with these functions needs to be handled in the same way as every
other.  Just because we're willing or unwilling to break compatibility
in one area doesn't mean we have to make the same decision in every
case.  We're allowed to take into effect the likely impact of making a
given change in deciding whether it's worth it.  Second, if in one or
more areas we decide that a hard backward compatibility break would be
too painful, then I think it's a good idea to ask ourselves how we
could ease the migration pain for people.  And I'd phrase that as an
open-ended question rather than "should we add a GUC?".

For example, one idea here is to create a to_timstamp_old() function
that retains the existing behavior of to_timestamp() without any
change, and then add a new to_timestamp() function and fix every
Oracle incompatibility we can find as thoroughly as we can do in one
release cycle.  So we fix this whitespace stuff, we fix the FM
modifier, and anything else that comes up, we fix it all.  Then, if
people run into trouble with the new behavior when they upgrade, we
tell them that they can either fix their application or, if they want
the old behavior, they can use to_timestamp_old().  We can also
document the differences between to_timestamp() and to_timestamp_old()
so that people can easily figure out whether those differences are
significant to them.

Another idea is to add an optional third argument to to_timestamp()
that can be used to set compatibility behaviors.

I'm not altogether convinced that it's worth the effort to provide
lots of backward-compatibility here.  Presumably, only a small
percentage of people use to_timestamp(), and only a percentage of
those are going to rely on the details we're talking about changing.
So it might be that if we just up and change this, a few people will
be grumpy and then they'll update their code and that will be it.  On
the other hand, maybe it'll be a real pain in the butt for lots of
people and we'll lose users.  I don't know how to judge how
significant these changes will be to users, and I think that the level
of impact does matter in deciding what to do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-25 Thread Steve Crawford
On Fri, Jun 24, 2016 at 3:43 PM, Joshua D. Drake 
wrote:

> On 06/24/2016 02:16 PM, Tom Lane wrote:
>
>> Robert Haas  writes:
>>
>>> On Fri, Jun 24, 2016 at 12:26 PM, Steve Crawford
>>>  wrote:
>>>
 To me, 2016-02-30 is an invalid date that should generate an error.

>>>
>> I don't particularly disagree with that, but on the other hand, as
>>> mentioned earlier, to_timestamp() is here for Oracle compatibility,
>>> and if it doesn't do what Oracle's function does, then (1) it's not
>>> useful for people migrating from Oracle and (2) we're making up the
>>> behavior out of whole cloth.  I think things that we invent ourselves
>>> should reject stuff like this, but in a compatibility function we
>>> might want to, say, have compatibility.
>>>
>>
>> Agreed, mostly, but ... how far are we prepared to go on that?
>>
>
> We don't at all. Our goal has never been Oracle compatibility. Yes, we
> have "made allowances" but we aren't in a position that requires that
> anymore.
>
> Let's just do it right.
>
> Sincerely,
>
> JD
>
> /me speaking as someone who handles many, many migrations, none of which
> have ever said, "do we have Oracle compatibility available".
>
>
Tongue (partlyish) in cheek:

Developer: I need a database to support my project. Based on my research
this PostgreSQL thing is awesome so we will use it.

PostgreSQL: Welcome to our community!

Developer: I need to convert a string to a timestamp. This to_timestamp()
function I tried does not operate as I expect based on the documentation.

PostgreSQL: Ah, yes, grasshopper. You are young and do not understand the
Things That Must Not Be Documented . In time you will grow a gray ponytail
and/or white beard and learn the history and ways of every database that
came before. Only then will you come to understand how The Functions
*truly* behave.

Developer: Are you #@%!$ kidding me?

I will allow that there may be selected cases where a good argument could
be made for intentionally overly permissive behavior in the pursuit of
compatibility. But in those cases the documentation should specify clearly
and in detail the deviant behavior and reason for its existence.

As one who selected PostgreSQL from the start, I am more interested in the
functions working correctly.

Cheers,
Steve


Re: [HACKERS] Bug in to_timestamp().

2016-06-24 Thread Joshua D. Drake

On 06/24/2016 02:16 PM, Tom Lane wrote:

Robert Haas  writes:

On Fri, Jun 24, 2016 at 12:26 PM, Steve Crawford
 wrote:

To me, 2016-02-30 is an invalid date that should generate an error.



I don't particularly disagree with that, but on the other hand, as
mentioned earlier, to_timestamp() is here for Oracle compatibility,
and if it doesn't do what Oracle's function does, then (1) it's not
useful for people migrating from Oracle and (2) we're making up the
behavior out of whole cloth.  I think things that we invent ourselves
should reject stuff like this, but in a compatibility function we
might want to, say, have compatibility.


Agreed, mostly, but ... how far are we prepared to go on that?


We don't at all. Our goal has never been Oracle compatibility. Yes, we 
have "made allowances" but we aren't in a position that requires that 
anymore.


Let's just do it right.

Sincerely,

JD

/me speaking as someone who handles many, many migrations, none of which 
have ever said, "do we have Oracle compatibility available".


--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-24 Thread Gavin Flower

On 25/06/16 08:33, Robert Haas wrote:

On Fri, Jun 24, 2016 at 12:26 PM, Steve Crawford
 wrote:

My observation has been that the PostgreSQL development group aims for
correctness and the elimination of surprising results. This was part of the
reason to eliminate a number of automatic casts to dates in earlier
versions.

To me, 2016-02-30 is an invalid date that should generate an error.
Automatically and silently changing it to be 2016-03-01 strikes me as a
behavior I'd expect from a certain other open-source database, not
PostgreSQL.

I don't particularly disagree with that, but on the other hand, as
mentioned earlier, to_timestamp() is here for Oracle compatibility,
and if it doesn't do what Oracle's function does, then (1) it's not
useful for people migrating from Oracle and (2) we're making up the
behavior out of whole cloth.  I think things that we invent ourselves
should reject stuff like this, but in a compatibility function we
might want to, say, have compatibility.


How about a to_timestamp_strict(), in addition?

Its very existence will help people (who bother to read the 
documentation) to look more closely at the differences between the 
definitions, and allow them to choose the most appropriate for their use 
case.



Cheers,
Gavin



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-24 Thread Tom Lane
Robert Haas  writes:
> On Fri, Jun 24, 2016 at 12:26 PM, Steve Crawford
>  wrote:
>> To me, 2016-02-30 is an invalid date that should generate an error.

> I don't particularly disagree with that, but on the other hand, as
> mentioned earlier, to_timestamp() is here for Oracle compatibility,
> and if it doesn't do what Oracle's function does, then (1) it's not
> useful for people migrating from Oracle and (2) we're making up the
> behavior out of whole cloth.  I think things that we invent ourselves
> should reject stuff like this, but in a compatibility function we
> might want to, say, have compatibility.

Agreed, mostly, but ... how far are we prepared to go on that?  The one
thing I know about that is different from Oracle and is not something that
most people would consider clearly wrong is the behavior of the FM prefix.
We think it's a prefix that modifies only the next format code; they think
it's a toggle.  If we make that act like Oracle, we will silently break an
awful lot of applications, and there will be *no way* to write code that
is correct under both interpretations.  (And no, I do not want to hear
"let's fix it with a GUC".)  So I'm afraid we're between a rock and a hard
place on that one --- but if we let that stand, the argument that Oracle's
to_timestamp should be treated as right by definition loses a lot of air.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-24 Thread Robert Haas
On Fri, Jun 24, 2016 at 12:26 PM, Steve Crawford
 wrote:
> My observation has been that the PostgreSQL development group aims for
> correctness and the elimination of surprising results. This was part of the
> reason to eliminate a number of automatic casts to dates in earlier
> versions.
>
> To me, 2016-02-30 is an invalid date that should generate an error.
> Automatically and silently changing it to be 2016-03-01 strikes me as a
> behavior I'd expect from a certain other open-source database, not
> PostgreSQL.

I don't particularly disagree with that, but on the other hand, as
mentioned earlier, to_timestamp() is here for Oracle compatibility,
and if it doesn't do what Oracle's function does, then (1) it's not
useful for people migrating from Oracle and (2) we're making up the
behavior out of whole cloth.  I think things that we invent ourselves
should reject stuff like this, but in a compatibility function we
might want to, say, have compatibility.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-24 Thread Joshua D. Drake

On 06/24/2016 09:26 AM, Steve Crawford wrote:

My observation has been that the PostgreSQL development group aims for
correctness and the elimination of surprising results. This was part of
the reason to eliminate a number of automatic casts to dates in earlier
versions.

To me, 2016-02-30 is an invalid date that should generate an error.
Automatically and silently changing it to be 2016-03-01 strikes me as a
behavior I'd expect from a certain other open-source database, not
PostgreSQL.


I don't think anybody could argue with that in good faith.

Sincerely,

JD

--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-24 Thread Steve Crawford
My observation has been that the PostgreSQL development group aims for
correctness and the elimination of surprising results. This was part of the
reason to eliminate a number of automatic casts to dates in earlier
versions.

To me, 2016-02-30 is an invalid date that should generate an error.
Automatically and silently changing it to be 2016-03-01 strikes me as a
behavior I'd expect from a certain other open-source database, not
PostgreSQL.

Cheers,
Steve

On Fri, Jun 24, 2016 at 8:52 AM, Alex Ignatov 
wrote:

>
> Alex Ignatov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
> On 20.06.2016 17:09, Albe Laurenz wrote:
>
>> Tom Lane wrote:
>>
>>> I don't necessarily have an opinion yet.  I would like to see more than
>>> just an unsupported assertion about what Oracle's behavior is.  Also,
>>> how should FM mode affect this?
>>>
>> I can supply what Oracle 12.1 does:
>>
>> SQL> SELECT to_timestamp('2016-06-13 15:43:36', ' /MM/DD HH24:MI:SS')
>> AS ts FROM dual;
>>
>> TS
>> 
>> 2016-06-13 15:43:36.0 AD
>>
>> SQL> SELECT to_timestamp('2016-06-13 15:43:36', '/MM/DD  HH24:MI:SS')
>> AS ts FROM dual;
>>
>> TS
>> 
>> 2016-06-13 15:43:36.0 AD
>>
>> SQL> SELECT to_timestamp('2016-06-1315:43:36', '/MM/DD
>> HH24:MI:SS') AS ts FROM dual;
>>
>> TS
>> 
>> 2016-06-13 15:43:36.0 AD
>>
>> (to_timestamp_tz behaves the same way.)
>>
>> So Oracle seems to make no difference between one or more spaces.
>>
>> Yours,
>> Laurenz Albe
>>
>> Guys, do we need to change this behavior or may be you can tell me that
> is normal because this and this:
>
> postgres=# SELECT TO_TIMESTAMP('2016-02-30 15:43:36', '-MM-DD
> HH24:MI:SS');
>   to_timestamp
> 
>  2016-03-01 15:43:36+03
> (1 row)
>
> but on the other side we have :
>
> postgres=# select '2016-02-30 15:43:36'::timestamp;
> ERROR:  date/time field value out of range: "2016-02-30 15:43:36"
> LINE 1: select '2016-02-30 15:43:36'::timestamp;
>
> Another bug in to_timestamp/date()?
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] Bug in to_timestamp().

2016-06-24 Thread Alex Ignatov


Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

On 20.06.2016 17:09, Albe Laurenz wrote:

Tom Lane wrote:

I don't necessarily have an opinion yet.  I would like to see more than
just an unsupported assertion about what Oracle's behavior is.  Also,
how should FM mode affect this?

I can supply what Oracle 12.1 does:

SQL> SELECT to_timestamp('2016-06-13 15:43:36', ' /MM/DD HH24:MI:SS') AS ts 
FROM dual;

TS

2016-06-13 15:43:36.0 AD

SQL> SELECT to_timestamp('2016-06-13 15:43:36', '/MM/DD  HH24:MI:SS') AS ts 
FROM dual;

TS

2016-06-13 15:43:36.0 AD

SQL> SELECT to_timestamp('2016-06-1315:43:36', '/MM/DD  HH24:MI:SS') AS 
ts FROM dual;

TS

2016-06-13 15:43:36.0 AD

(to_timestamp_tz behaves the same way.)

So Oracle seems to make no difference between one or more spaces.

Yours,
Laurenz Albe

Guys, do we need to change this behavior or may be you can tell me that 
is normal because this and this:


postgres=# SELECT TO_TIMESTAMP('2016-02-30 15:43:36', '-MM-DD 
HH24:MI:SS');

  to_timestamp

 2016-03-01 15:43:36+03
(1 row)

but on the other side we have :

postgres=# select '2016-02-30 15:43:36'::timestamp;
ERROR:  date/time field value out of range: "2016-02-30 15:43:36"
LINE 1: select '2016-02-30 15:43:36'::timestamp;

Another bug in to_timestamp/date()?


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-24 Thread Alex Ignatov


On 23.06.2016 20:40, Tom Lane wrote:

Robert Haas  writes:

On Thu, Jun 23, 2016 at 1:12 PM, David G. Johnston
 wrote:

My understanding is that is not going to change for 9.6.

That's exactly what is under discussion here.

I would definitely agree with David on that point.  Making to_timestamp
noticeably better on this score seems like a nontrivial project, and
post-beta is not the time for that sort of thing, even if we had full
consensus on what to do.  I'd suggest somebody work on a patch and put
it up for review in the next cycle.

Now, if you were to narrowly define the problem as "whether to skip
non-spaces for a space in the format", maybe that could be fixed
post-beta, but I think that's a wrongheaded approach.  to_timestamp's
issues with input that doesn't match the format are far wider than that.
IMO we should try to resolve the whole problem with one coherent change,
not make incremental incompatible changes at the margins.

At the very least I'd want to see a thought-through proposal that
addresses all three of these interrelated points:

* what should a space in the format match
* what should a non-space, non-format-code character in the format match
* how should we handle fields that are not exactly the width suggested
by the format

regards, tom lane


Totally agree that we need more discussion about error handling in this 
function!


Also this behavior is observed in to_date() and to_number() function:

postgres=# SELECT 
TO_DATE('2!0!1!6!0!6-/-/-/-/-/-/-1!/-/-/-/-/-/-/-3!', '-MM-DD');

  to_date

 0002-01-01
(1 row)

postgres=# postgres=# select to_number('1$#@!!,2,%,%4,5,@%5@4..8-', 
'999G999D9S');

 to_number
---
12
(1 row)

On the our side we have some discussions about to write a patch that 
will change this incorrect  behavior. So stay tuned.


--

Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-23 Thread David G. Johnston
On Thu, Jun 23, 2016 at 2:00 PM, Robert Haas  wrote:

> On Thu, Jun 23, 2016 at 1:46 PM, David G. Johnston
>  wrote:
> >> Sheesh.  Who put you in charge of this?  You basically seem like you
> >> are trying to shut up anyone who supports this change, and I don't
> >> think that's right.
> >
> > I'm all for a change in this area - though I'm not impacted enough to
> > actually work on a design proposal.
>
> You weren't for a change in this area a few emails ago; back then, you
> said "I don't see that changing it is a worthwhile endeavor".
>

​I still don't see changing to_timestamp as being the right approach...but
that's just my opinion.  I do think that this area "text to timestamp
conversions" could stand to be improved.  I may not have been clear on that
distinction.​  But changing the behavior of a function that has been around
since 2000 seems to be just asking for trouble.


> > And I'm not sure how asking for ideas
> > constitutes trying to shut people up.  Especially since if no one does
> float
> > a proposal we'll simply have this discussion next year when someone new
> > discovers how badly behaved this function is.
>
> In my opinion, telling people that their emails are not constructive
> and that no change is possible for 9.6 is pretty much the same thing
> as trying to shut people up.  And that's what you did.
>

​I guess I should be a bit more careful to make sure that two separate
responses ​on different aspects of a topic cannot be so easily construed as
"this thread is pointless".  To be clear I did and still do believe that
getting a change in this area into 10.0 is worthwhile; and that our policy
(and present circumstances) appears to preclude this changing in 9.6.  But
as noted below this is just an observation.


> >>  My main point is that I'm inclined to deprecate it.
> >>
> >> I can almost guarantee that would make a lot of users very unhappy.
> >> This function is widely used.
> >
> > Tell people not to use.  We'd leave it in, unchanged, on backward
> > compatibility grounds.  This seems like acceptable behavior for the
> project.
> > Am I mistaken?
>
> Yes.  We're not going to deprecate widely-used functionality.  We
> might, however, fix it, contrary to your representations upthread.
>

​At this point I feel you are cherry-picking my words to fit your present
feelings.  I stand by everything I've written upthread regarding
deprecation and fixing.  I'm personally in favor of the former.  I'll add
that you are a committer, I am not.  The one thing going for change is if
we indeed exactly match the reference behavior and then document it as
being a compatibility function.  I hadn't fully pondered this goal and how
its plays into changing 16 year old behavior.  Obviously a new name for the
function doesn't cut it in this scenario.


> >> > My second point is if you are going to use this badly designed
> function
> >> > you
> >> > need to protect yourself.
> >>
> >> I agree that anyone using this function should test their format
> >> strings carefully.
> >>
> >> > My understanding is that is not going to change for 9.6.
> >>
> >> That's exactly what is under discussion here.
> >>
> >
> > Ok.  I'm having trouble seeing this justified as a bug fix - the docs
> > clearly state our behavior is intentional.  Improved behavior, yes, but
> > that's a feature and we're in beta2.  Please be specific if you believe
> I've
> > misinterpreted project policies on this matter.
>
> I would not vote to back-patch a change in this area because I think
> that could break SQL code that works today.  But I think the current
> behavior is pretty well indefensible.  It's neither intuitive nor
> compatible with Oracle.  And there is plenty of existing precedent for
> making this sort of change during the beta period.  We regard it as a
> bug fix which is too dangerous to back-patch; therefore, it is neither
> subject to feature freeze nor does it go into existing stable
> releases.  Whether to treat this particular issue in that particular
> way is something that needs to be hammered out in discussion.  Tom
> raises the valid point that we need to make sure that we've thoroughly
> studied this issue and fixed all of the related cases before
> committing anything -- we don't want to change the behavior in
> 9.6beta, release 9.6, and then have to change it again for 9.7.  But
> there is certainly no project policy which says we can't change this
> now for 9.6 if we decide that (1) the current behavior is wrong and
> (2) we are quite sure we know how to fix it.
>
>
Thank You.

I still conclude that given the general lack of complaints, the fact we are
at beta2, the age of the feature and nature of the "bug", and the
non-existence of a working patch even for HEAD, that 9.6 is not going to
see this behavior changed and you will be loath to back-patch a fix once
9.6 becomes stable.  I'll admit the possibility does exist but am not all
that keen on couching 

Re: [HACKERS] Bug in to_timestamp().

2016-06-23 Thread Jim Nasby

On 6/23/16 12:29 PM, Alvaro Herrera wrote:

David G. Johnston wrote:

On Thursday, June 23, 2016, Alex Ignatov  wrote:


Arguing just like that one can say that we don't even need exception like
"division by zero". Just use well-formed numbers in denominator...
Input data  sometimes can be generated automagically. Without exception
throwing debugging stored function containing to_timestamp can be painful.


to_timestamp with its present behavior is, IMO, a poorly designed function
that would never be accepted today.


I'm not sure about that.

to_timestamp was added to improve compatibility with Oracle, by commit
b866d2e2d794.  I suppose the spec should follow what's documented here,

http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions193.htm
http://docs.oracle.com/cd/B19306_01/server.102/b14200/sql_elements004.htm#i34924

and that wherever we deviate from that, is a bug that should be fixed.


+1

I'm also in favor of a parsing function that actually follows the format 
specifier, but not enough to write a patch.


One thing that I think could happen for 9.6 is documenting how you could 
get the desired results with one of the regex functions. Not the nicest 
way to handle this problem, but it is workable and having a regex 
example available for people to start with would be very helpful.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-23 Thread Tom Lane
Robert Haas  writes:
> On Thu, Jun 23, 2016 at 1:40 PM, Tom Lane  wrote:
>> At the very least I'd want to see a thought-through proposal that
>> addresses all three of these interrelated points:
>> 
>> * what should a space in the format match
>> * what should a non-space, non-format-code character in the format match
>> * how should we handle fields that are not exactly the width suggested
>> by the format

> I'm not averse to some further study of those issues, and I think the
> first two are closely related.  The third one strikes me as a somewhat
> separate consideration that doesn't need to be addressed by the same
> patch.

If you think those issues are not interrelated, you have not thought
about it carefully enough.

As an example, what we can do to handle not-expected-width fields is
very different if the format is "DDMMYY" versus if it is "DD-MM-YY".
In the first case we have little choice but to believe that each
field is exactly two digits wide.  In the second case, depending on
how we decide to define matching of "-", we might be able to allow
the field widths to vary so that they're effectively "whatever is
between the dashes".  But that would require insisting that "-"
match a "-", or at least a non-alphanumeric, which is not how it
behaves today.

I don't want to twiddle these behaviors in 9.6 and then again next year.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-23 Thread Robert Haas
On Thu, Jun 23, 2016 at 1:46 PM, David G. Johnston
 wrote:
>> Sheesh.  Who put you in charge of this?  You basically seem like you
>> are trying to shut up anyone who supports this change, and I don't
>> think that's right.
>
> I'm all for a change in this area - though I'm not impacted enough to
> actually work on a design proposal.

You weren't for a change in this area a few emails ago; back then, you
said "I don't see that changing it is a worthwhile endeavor".

> And I'm not sure how asking for ideas
> constitutes trying to shut people up.  Especially since if no one does float
> a proposal we'll simply have this discussion next year when someone new
> discovers how badly behaved this function is.

In my opinion, telling people that their emails are not constructive
and that no change is possible for 9.6 is pretty much the same thing
as trying to shut people up.  And that's what you did.

>>  My main point is that I'm inclined to deprecate it.
>>
>> I can almost guarantee that would make a lot of users very unhappy.
>> This function is widely used.
>
> Tell people not to use.  We'd leave it in, unchanged, on backward
> compatibility grounds.  This seems like acceptable behavior for the project.
> Am I mistaken?

Yes.  We're not going to deprecate widely-used functionality.  We
might, however, fix it, contrary to your representations upthread.

>> > My second point is if you are going to use this badly designed function
>> > you
>> > need to protect yourself.
>>
>> I agree that anyone using this function should test their format
>> strings carefully.
>>
>> > My understanding is that is not going to change for 9.6.
>>
>> That's exactly what is under discussion here.
>>
>
> Ok.  I'm having trouble seeing this justified as a bug fix - the docs
> clearly state our behavior is intentional.  Improved behavior, yes, but
> that's a feature and we're in beta2.  Please be specific if you believe I've
> misinterpreted project policies on this matter.

I would not vote to back-patch a change in this area because I think
that could break SQL code that works today.  But I think the current
behavior is pretty well indefensible.  It's neither intuitive nor
compatible with Oracle.  And there is plenty of existing precedent for
making this sort of change during the beta period.  We regard it as a
bug fix which is too dangerous to back-patch; therefore, it is neither
subject to feature freeze nor does it go into existing stable
releases.  Whether to treat this particular issue in that particular
way is something that needs to be hammered out in discussion.  Tom
raises the valid point that we need to make sure that we've thoroughly
studied this issue and fixed all of the related cases before
committing anything -- we don't want to change the behavior in
9.6beta, release 9.6, and then have to change it again for 9.7.  But
there is certainly no project policy which says we can't change this
now for 9.6 if we decide that (1) the current behavior is wrong and
(2) we are quite sure we know how to fix it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-23 Thread David G. Johnston
On Thursday, June 23, 2016, Robert Haas  wrote:

> On Thu, Jun 23, 2016 at 1:12 PM, David G. Johnston
> > wrote:
> > to_timestamp with its present behavior is, IMO, a poorly designed
> function
> > that would never be accepted today.  Concrete proposals for either
> fixing or
> > deprecating (or both) are welcome.  Fixing it should not cause
> unnecessary
> > errors to be raised.
>
> Sheesh.  Who put you in charge of this?  You basically seem like you
> are trying to shut up anyone who supports this change, and I don't
> think that's right.


>
I'm all for a change in this area - though I'm not impacted enough to
actually work on a design proposal.  And I'm not sure how asking for ideas
constitutes trying to shut people up.  Especially since if no one does
float a proposal we'll simply have this discussion next year when someone
new discovers how badly behaved this function is.


>  My main point is that I'm inclined to deprecate it.
>

> I can almost guarantee that would make a lot of users very unhappy.
> This function is widely used.
>
>
Tell people not to use.  We'd leave it in, unchanged, on backward
compatibility grounds.  This seems like acceptable behavior for the
project.  Am I mistaken?


> > My second point is if you are going to use this badly designed function
> you
> > need to protect yourself.
>
> I agree that anyone using this function should test their format
> strings carefully.
>
> > My understanding is that is not going to change for 9.6.
>
> That's exactly what is under discussion here.
>
>
Ok.  I'm having trouble seeing this justified as a bug fix - the docs
clearly state our behavior is intentional.  Improved behavior, yes, but
that's a feature and we're in beta2.  Please be specific if you believe
I've misinterpreted project policies on this matter.

David J.


Re: [HACKERS] Bug in to_timestamp().

2016-06-23 Thread Robert Haas
On Thu, Jun 23, 2016 at 1:40 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Thu, Jun 23, 2016 at 1:12 PM, David G. Johnston
>>  wrote:
>>> My understanding is that is not going to change for 9.6.
>
>> That's exactly what is under discussion here.
>
> I would definitely agree with David on that point.  Making to_timestamp
> noticeably better on this score seems like a nontrivial project, and
> post-beta is not the time for that sort of thing, even if we had full
> consensus on what to do.  I'd suggest somebody work on a patch and put
> it up for review in the next cycle.
>
> Now, if you were to narrowly define the problem as "whether to skip
> non-spaces for a space in the format", maybe that could be fixed
> post-beta, but I think that's a wrongheaded approach.  to_timestamp's
> issues with input that doesn't match the format are far wider than that.
> IMO we should try to resolve the whole problem with one coherent change,
> not make incremental incompatible changes at the margins.
>
> At the very least I'd want to see a thought-through proposal that
> addresses all three of these interrelated points:
>
> * what should a space in the format match
> * what should a non-space, non-format-code character in the format match
> * how should we handle fields that are not exactly the width suggested
> by the format

I'm not averse to some further study of those issues, and I think the
first two are closely related.  The third one strikes me as a somewhat
separate consideration that doesn't need to be addressed by the same
patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-23 Thread Tom Lane
Robert Haas  writes:
> On Thu, Jun 23, 2016 at 1:12 PM, David G. Johnston
>  wrote:
>> My understanding is that is not going to change for 9.6.

> That's exactly what is under discussion here.

I would definitely agree with David on that point.  Making to_timestamp
noticeably better on this score seems like a nontrivial project, and
post-beta is not the time for that sort of thing, even if we had full
consensus on what to do.  I'd suggest somebody work on a patch and put
it up for review in the next cycle.

Now, if you were to narrowly define the problem as "whether to skip
non-spaces for a space in the format", maybe that could be fixed
post-beta, but I think that's a wrongheaded approach.  to_timestamp's
issues with input that doesn't match the format are far wider than that.
IMO we should try to resolve the whole problem with one coherent change,
not make incremental incompatible changes at the margins.

At the very least I'd want to see a thought-through proposal that
addresses all three of these interrelated points:

* what should a space in the format match
* what should a non-space, non-format-code character in the format match
* how should we handle fields that are not exactly the width suggested
by the format

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-23 Thread Alvaro Herrera
David G. Johnston wrote:
> On Thursday, June 23, 2016, Alex Ignatov  wrote:
> 
> > Arguing just like that one can say that we don't even need exception like
> > "division by zero". Just use well-formed numbers in denominator...
> > Input data  sometimes can be generated automagically. Without exception
> > throwing debugging stored function containing to_timestamp can be painful.
> 
> to_timestamp with its present behavior is, IMO, a poorly designed function
> that would never be accepted today.

I'm not sure about that.

to_timestamp was added to improve compatibility with Oracle, by commit
b866d2e2d794.  I suppose the spec should follow what's documented here,

http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions193.htm
http://docs.oracle.com/cd/B19306_01/server.102/b14200/sql_elements004.htm#i34924

and that wherever we deviate from that, is a bug that should be fixed.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-23 Thread Robert Haas
On Thu, Jun 23, 2016 at 1:12 PM, David G. Johnston
 wrote:
> to_timestamp with its present behavior is, IMO, a poorly designed function
> that would never be accepted today.  Concrete proposals for either fixing or
> deprecating (or both) are welcome.  Fixing it should not cause unnecessary
> errors to be raised.

Sheesh.  Who put you in charge of this?  You basically seem like you
are trying to shut up anyone who supports this change, and I don't
think that's right.  Alex's opinion is just as valid as yours -
neither more nor less - and he has every right to express it without
being told by you that his emails are "not constructive".

> My main point is that I'm inclined to deprecate it.

I can almost guarantee that would make a lot of users very unhappy.
This function is widely used.

> My second point is if you are going to use this badly designed function you
> need to protect yourself.

I agree that anyone using this function should test their format
strings carefully.

> My understanding is that is not going to change for 9.6.

That's exactly what is under discussion here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-23 Thread Robert Haas
On Thu, Jun 23, 2016 at 12:37 PM, David G. Johnston
 wrote
> To be honest I don't see how this is relevant to quoted content.  And you've
> already made this point quite clearly - repeating it isn't constructive.
> This behavior has existed for a long time and I don't see that changing it
> is a worthwhile endeavor.  I believe a new function is required that has
> saner behavior.  Otherwise given good input and a well-formed parse string
> the function does exactly what it is designed to do.  Avoid giving it
> garbage and you will be fine.  Maybe wrap the call to the in a function that
> also checks for the expected layout and RAISE EXCEPTION if it doesn't match.

Well, I think other people are allowed to disagree about whether
changing it is a worthwhile endeavor.  I think there's an excellent
argument that the current behavior is stupid and broken and probably
nobody is intentionally relying on it.

Obviously, if somebody is relying on the existing behavior and we
change it and it breaks things, then that's bad, and a good argument
for worrying about backward-compatibility - e.g. by adding a new
function.  But who would actually like the behavior that an extra
space in the format string causes non-whitespace characters to get
skipped?  Next you'll be arguing that we can't fix a server crash
that's triggered by a certain query because somebody might be relying
on that query to restart the server...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-23 Thread David G. Johnston
On Thursday, June 23, 2016, Alex Ignatov  wrote:

> Arguing just like that one can say that we don't even need exception like
> "division by zero". Just use well-formed numbers in denominator...
> Input data  sometimes can be generated automagically. Without exception
> throwing debugging stored function containing to_timestamp can be painful.
>

to_timestamp with its present behavior is, IMO, a poorly designed function
that would never be accepted today.  Concrete proposals for either fixing
or deprecating (or both) are welcome.  Fixing it should not
cause unnecessary errors to be raised.

My main point is that I'm inclined to deprecate it.

My second point is if you are going to use this badly designed function you
need to protect yourself.

My understanding is that is not going to change for 9.6.

David J.


Re: [HACKERS] Bug in to_timestamp().

2016-06-23 Thread Alex Ignatov


On 23.06.2016 19:37, David G. Johnston wrote:
On Thu, Jun 23, 2016 at 12:16 PM, Alex Ignatov 
>wrote:



On 23.06.2016 16:30, Bruce Momjian wrote:

On Thu, Jun 23, 2016 at 07:41:26AM +, amul sul wrote:

On Monday, 20 June 2016 8:53 PM, Alex Ignatov
> wrote:


On 13.06.2016 18:52, amul sul wrote:

And it wont stop on some simple whitespace. By using
to_timestamp you
can get any output results by providing illegal input
parameters values:
postgres=# SELECT TO_TIMESTAMP('2016-06-13 99
:99:99', 'MMDD
HH24:MI:SS');
   to_timestamp

  2016-01-06 14:40:39+03

(1 row)

We do consume extra space from input string, but not if it
is in format string, see below:

postgres=# SELECT TO_TIMESTAMP('2016-06-13 15:43:36',
'/MM/DD HH24:MI:SS');
to_timestamp

2016-06-13 15:43:36-07
(1 row)

We should have same treatment for format string too.

Thoughts? Comments?

Well, the user specifies the format string, while the input
string comes
from the data, so I don't see having them behave the same as
necessary.


To be honest they not just behave differently. to_timestamp is
just incorrectly  handles input data and nothing else.There is no
excuse for such behavior:

postgres=# SELECT TO_TIMESTAMP('20:-16-06:13: 15_43:!36',
'/MM/DD HH24:MI:SS');
 to_timestamp
--
 0018-08-05 13:15:43+02:30:17
(1 row)


T
​o be honest I don't see how this is relevant to quoted content.  And 
you've already made this point quite clearly - repeating it isn't 
constructive.  This behavior has existed for a long time and I don't 
see that changing it is a worthwhile endeavor.  I believe a new 
function is required that has saner behavior. Otherwise given good 
input and a well-formed parse string the function does exactly what it 
is designed to do.  Avoid giving it garbage and you will be fine. 
Maybe wrap the call to the in a function that also checks for the 
expected layout and RAISE EXCEPTION if it doesn't match.


​David J.
​
​
Arguing just like that one can say that we don't even need exception 
like "division by zero". Just use well-formed numbers in denominator...
Input data  sometimes can be generated automagically. Without exception 
throwing debugging stored function containing to_timestamp can be painful.




Re: [HACKERS] Bug in to_timestamp().

2016-06-23 Thread David G. Johnston
On Thu, Jun 23, 2016 at 12:16 PM, Alex Ignatov 
wrote:

>
> On 23.06.2016 16:30, Bruce Momjian wrote:
>
>> On Thu, Jun 23, 2016 at 07:41:26AM +, amul sul wrote:
>>
>>> On Monday, 20 June 2016 8:53 PM, Alex Ignatov 
>>> wrote:
>>>
>>>
>>> On 13.06.2016 18:52, amul sul wrote:
>
 And it wont stop on some simple whitespace. By using to_timestamp you
 can get any output results by providing illegal input parameters values:
 postgres=# SELECT TO_TIMESTAMP('2016-06-13 99:99:99', 'MMDD
 HH24:MI:SS');
to_timestamp
 
   2016-01-06 14:40:39+03

 (1 row)

>>> We do consume extra space from input string, but not if it is in format
>>> string, see below:
>>>
>>> postgres=# SELECT TO_TIMESTAMP('2016-06-13  15:43:36', '/MM/DD
>>> HH24:MI:SS');
>>> to_timestamp
>>> 
>>> 2016-06-13 15:43:36-07
>>> (1 row)
>>>
>>> We should have same treatment for format string too.
>>>
>>> Thoughts? Comments?
>>>
>> Well, the user specifies the format string, while the input string comes
>> from the data, so I don't see having them behave the same as necessary.
>>
>>
> To be honest they not just behave differently.  to_timestamp is just
> incorrectly  handles input data and nothing else.There is no excuse for
> such behavior:
>
> postgres=# SELECT TO_TIMESTAMP('20:-16-06:13: 15_43:!36', '/MM/DD
> HH24:MI:SS');
>  to_timestamp
> --
>  0018-08-05 13:15:43+02:30:17
> (1 row)
>

T
​o be honest I don't see how this is relevant to quoted content.  And
you've already made this point quite clearly - repeating it isn't
constructive.  This behavior has existed for a long time and I don't see
that changing it is a worthwhile endeavor.  I believe a new function is
required that has saner behavior.  Otherwise given good input and a
well-formed parse string the function does exactly what it is designed to
do.  Avoid giving it garbage and you will be fine.  Maybe wrap the call to
the in a function that also checks for the expected layout and RAISE
EXCEPTION if it doesn't match.

​David J.
​
​


Re: [HACKERS] Bug in to_timestamp().

2016-06-23 Thread Alex Ignatov


On 23.06.2016 16:30, Bruce Momjian wrote:

On Thu, Jun 23, 2016 at 07:41:26AM +, amul sul wrote:

On Monday, 20 June 2016 8:53 PM, Alex Ignatov  wrote:



On 13.06.2016 18:52, amul sul wrote:

And it wont stop on some simple whitespace. By using to_timestamp you
can get any output results by providing illegal input parameters values:
postgres=# SELECT TO_TIMESTAMP('2016-06-13 99:99:99', 'MMDD
HH24:MI:SS');
   to_timestamp

  2016-01-06 14:40:39+03

(1 row)

We do consume extra space from input string, but not if it is in format string, 
see below:

postgres=# SELECT TO_TIMESTAMP('2016-06-13  15:43:36', '/MM/DD 
HH24:MI:SS');
to_timestamp

2016-06-13 15:43:36-07
(1 row)

We should have same treatment for format string too.

Thoughts? Comments?

Well, the user specifies the format string, while the input string comes
from the data, so I don't see having them behave the same as necessary.



To be honest they not just behave differently.  to_timestamp is just 
incorrectly  handles input data and nothing else.There is no excuse for 
such behavior:


postgres=# SELECT TO_TIMESTAMP('20:-16-06:13: 15_43:!36', '/MM/DD 
HH24:MI:SS');

 to_timestamp
--
 0018-08-05 13:15:43+02:30:17
(1 row)



Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-23 Thread Bruce Momjian
On Thu, Jun 23, 2016 at 07:41:26AM +, amul sul wrote:
> On Monday, 20 June 2016 8:53 PM, Alex Ignatov  
> wrote:
> 
> 
> >>On 13.06.2016 18:52, amul sul wrote:
> >And it wont stop on some simple whitespace. By using to_timestamp you 
> >can get any output results by providing illegal input parameters values:
> 
> >postgres=# SELECT TO_TIMESTAMP('2016-06-13 99:99:99', 'MMDD 
> >HH24:MI:SS');
> >   to_timestamp
> >
> >  2016-01-06 14:40:39+03
> >
> > (1 row)
> 
> We do consume extra space from input string, but not if it is in format 
> string, see below:
> 
> postgres=# SELECT TO_TIMESTAMP('2016-06-13  15:43:36', '/MM/DD 
> HH24:MI:SS'); 
> to_timestamp 
> 
> 2016-06-13 15:43:36-07
> (1 row)
> 
> We should have same treatment for format string too.
> 
> Thoughts? Comments?

Well, the user specifies the format string, while the input string comes
from the data, so I don't see having them behave the same as necessary.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-23 Thread amul sul
On Monday, 20 June 2016 8:53 PM, Alex Ignatov  wrote:


>>On 13.06.2016 18:52, amul sul wrote:
>And it wont stop on some simple whitespace. By using to_timestamp you 
>can get any output results by providing illegal input parameters values:

>postgres=# SELECT TO_TIMESTAMP('2016-06-13 99:99:99', 'MMDD 
>HH24:MI:SS');
>   to_timestamp
>
>  2016-01-06 14:40:39+03
>
> (1 row)

We do consume extra space from input string, but not if it is in format string, 
see below:

postgres=# SELECT TO_TIMESTAMP('2016-06-13  15:43:36', '/MM/DD 
HH24:MI:SS'); 
to_timestamp 

2016-06-13 15:43:36-07
(1 row)

We should have same treatment for format string too.

Thoughts? Comments?

Regards,
Amul Sul


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-20 Thread Alex Ignatov


On 13.06.2016 18:52, amul sul wrote:

Hi,

It's look like bug in to_timestamp() function when format string has more 
whitespaces compare to input string, see below:

Ex.1: Two white spaces before HH24 whereas one before input time string

postgres=# SELECT TO_TIMESTAMP('2016-06-13 15:43:36', '/MM/DD  HH24:MI:SS');
to_timestamp

2016-06-13 05:43:36-07   <— incorrect time
(1 row)



Ex.2: One whitespace before  format string

postgres=# SELECT TO_TIMESTAMP('2016/06/13 15:43:36', ' /MM/DD HH24:MI:SS');
to_timestamp
--
0016-06-13 15:43:36-07:52:58  <— incorrect year
(1 row)



If there are one or more consecutive whitespace in the format, we should skip 
those as long as we could get an actual field.
Thoughts?
Thanks & Regards,
Amul Sul




From docs about to_timestamp() ( 
https://www.postgresql.org/docs/9.5/static/functions-formatting.html)
"These functions interpret input liberally, with minimal error checking. 
While they produce valid output, the conversion can yield unexpected 
results. For example, input to these functions is not restricted by 
normal ranges, thus to_date('20096040','MMDD') returns 2014-01-17 
rather than causing an error. Casting does not have this behavior."


And it wont stop on some simple whitespace. By using to_timestamp you 
can get any output results by providing illegal input parameters values:


postgres=# SELECT TO_TIMESTAMP('2016-06-13 99:99:99', 'MMDD 
HH24:MI:SS');

  to_timestamp

 2016-01-06 14:40:39+03
(1 row)


Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-20 Thread Alex Ignatov


On 20.06.2016 16:36, Tom Lane wrote:

Robert Haas  writes:

On Mon, Jun 13, 2016 at 12:25 PM, Robert Haas  wrote:

I think a space in the format string should skip a whitespace
character in the input string, but not a non-whitespace character.
It's my understanding that these functions exist in no small part for
compatibility with Oracle, and Oracle declines to skip the digit '1'
on the basis of an extra space in the format string, which IMHO is the
behavior any reasonable user would expect.

So Amul and I are of one opinion and Tom is of another.  Anyone else
have an opinion?

I don't necessarily have an opinion yet.  I would like to see more than
just an unsupported assertion about what Oracle's behavior is.  Also,
how should FM mode affect this?

regards, tom lane




Oracle:
SQL> SELECT TO_TIMESTAMP('2016-06-13 99:99:99', 'MMDD HH24:MI:SS') 
from dual;

SELECT TO_TIMESTAMP('2016-06-13 99:99:99', 'MMDD HH24:MI:SS') from dual
*
ERROR at line 1:
ORA-01843: not a valid month

PG:

postgres=# SELECT TO_TIMESTAMP('2016-06-13 99:99:99', 'MMDD 
HH24:MI:SS');

  to_timestamp

 2016-01-06 14:40:39+03
(1 row)


I know about:
"These functions interpret input liberally, with minimal error checking. 
While they produce valid output, the conversion can yield unexpected 
results" from docs but by providing illegal input parameters  we have no 
any exceptions or errors about that.
I think that to_timestamp() need to has more format checking than it has 
now.


Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-20 Thread Albe Laurenz
Tom Lane wrote:
> I don't necessarily have an opinion yet.  I would like to see more than
> just an unsupported assertion about what Oracle's behavior is.  Also,
> how should FM mode affect this?

I can supply what Oracle 12.1 does:

SQL> SELECT to_timestamp('2016-06-13 15:43:36', ' /MM/DD HH24:MI:SS') AS ts 
FROM dual;

TS

2016-06-13 15:43:36.0 AD

SQL> SELECT to_timestamp('2016-06-13 15:43:36', '/MM/DD  HH24:MI:SS') AS ts 
FROM dual;

TS

2016-06-13 15:43:36.0 AD

SQL> SELECT to_timestamp('2016-06-1315:43:36', '/MM/DD  HH24:MI:SS') AS 
ts FROM dual;

TS

2016-06-13 15:43:36.0 AD

(to_timestamp_tz behaves the same way.)

So Oracle seems to make no difference between one or more spaces.

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-20 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jun 13, 2016 at 12:25 PM, Robert Haas  wrote:
>> I think a space in the format string should skip a whitespace
>> character in the input string, but not a non-whitespace character.
>> It's my understanding that these functions exist in no small part for
>> compatibility with Oracle, and Oracle declines to skip the digit '1'
>> on the basis of an extra space in the format string, which IMHO is the
>> behavior any reasonable user would expect.

> So Amul and I are of one opinion and Tom is of another.  Anyone else
> have an opinion?

I don't necessarily have an opinion yet.  I would like to see more than
just an unsupported assertion about what Oracle's behavior is.  Also,
how should FM mode affect this?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-20 Thread David G. Johnston
On Mon, Jun 20, 2016 at 8:19 AM, Robert Haas  wrote:

> On Mon, Jun 13, 2016 at 12:25 PM, Robert Haas 
> wrote:
> > On Mon, Jun 13, 2016 at 12:12 PM, Tom Lane  wrote:
> >> amul sul  writes:
> >>> It's look like bug in to_timestamp() function when format string has
> more whitespaces compare to input string, see below:
> >>
> >> No, I think this is a case of "input doesn't match the format string".
> >>
> >> As a rule of thumb, using to_timestamp() for input that could be parsed
> >> just fine by the standard timestamp input function is not a particularly
> >> good idea.  to_timestamp() is meant to deal with input that is in a
> >> well-defined format that happens to not be parsable by timestamp_in.
> >> This example doesn't meet either of those preconditions.
> >
> > I think a space in the format string should skip a whitespace
> > character in the input string, but not a non-whitespace character.
> > It's my understanding that these functions exist in no small part for
> > compatibility with Oracle, and Oracle declines to skip the digit '1'
> > on the basis of an extra space in the format string, which IMHO is the
> > behavior any reasonable user would expect.
>
> So Amul and I are of one opinion and Tom is of another.  Anyone else
> have an opinion?
>
>
​At least Tom's position has the benefit of being consistent with current
behavior.  The current implementation doesn't actually care what literal
value you specify - any non-special character consumes a single token from
the input, period.

SELECT TO_TIMESTAMP('2016-06-13 15:43:36', '/MM/DD--HH24:MI:SS');
SELECT TO_TIMESTAMP('2016-06-13 15:43:36', '/MM/DD-HH24:MI:SS');

Both the above exhibit the same behavior as if you used a space instead of
the hyphen as the group separator.

The documentation should be updated to make this particular dynamic more
clear.

I don't see changing the general behavior of these "date formatting"
functions a worthwhile endeavor.​  Adding a less-liberal "parse_timestamp"
function I could get behind.

IOW, the user seems to be happy with the fact that the "/" in the date
matches his "-" but them complains that the space matches the number "1".
You don't get to have it both ways.

[re-reads the third usage note]

Or maybe you do.  We already define space as a being a greedy operator
(when not used in conjunction with FX).  A greedy space-space sequence
makes little sense on its face and if we are going to be helpful here we
should treat it as a single greedy space matcher.

Note that "returns an error because to_timestamp expects one space only" is
wrong - it errors because only a single space is captured and then the
attempt to parse 'JUN' using "MON" fails.  The following query doesn't
fail though it exhibits the same space discrepancy (it just gives the same
"wrong" result).

SELECT TO_TIMESTAMP('2016-06-13 15:43:36', 'FX/MM/DD  HH24:MI:SS');

Given that we already partially special-case the space expression I'd be
inclined to consider Robert's and Amul's position on the matter.  I think
I'd redefine our treatment of space to be "zero or more" instead of "one or
more" and require that it only match a literal space in the input.

Having considered that, I'm not convinced its worth a compatibility break.
I'd much rather deprecate these  versions and write
slightly-less-liberal versions named .

In any case I'd called the present wording a bug. Instead:

A single space consumes a single token of input and then, unless the FX
modifier is present, consumes zero or more subsequent literal spaces.
Thus, using two spaces in a row without the FX modifier, while allowed, is
unlikely to give you a satisfactory result.  The first space will consume
all available consecutive spaces so that the second space will be
guaranteed to consume a non-space token from the input.

David J.


Re: [HACKERS] Bug in to_timestamp().

2016-06-20 Thread Robert Haas
On Mon, Jun 13, 2016 at 12:25 PM, Robert Haas  wrote:
> On Mon, Jun 13, 2016 at 12:12 PM, Tom Lane  wrote:
>> amul sul  writes:
>>> It's look like bug in to_timestamp() function when format string has more 
>>> whitespaces compare to input string, see below:
>>
>> No, I think this is a case of "input doesn't match the format string".
>>
>> As a rule of thumb, using to_timestamp() for input that could be parsed
>> just fine by the standard timestamp input function is not a particularly
>> good idea.  to_timestamp() is meant to deal with input that is in a
>> well-defined format that happens to not be parsable by timestamp_in.
>> This example doesn't meet either of those preconditions.
>
> I think a space in the format string should skip a whitespace
> character in the input string, but not a non-whitespace character.
> It's my understanding that these functions exist in no small part for
> compatibility with Oracle, and Oracle declines to skip the digit '1'
> on the basis of an extra space in the format string, which IMHO is the
> behavior any reasonable user would expect.

So Amul and I are of one opinion and Tom is of another.  Anyone else
have an opinion?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-15 Thread amul sul
On Monday, 13 June 2016 9:57 PM, Robert Haas  wrote:


>I think a space in the format string should skip a whitespace
>character in the input string, but not a non-whitespace character.

+1, enough the benefit is we are giving correct output. 
 
PFA patch proposing this fix.

Regards,
Amul Sul. 


0001-RM37358-space-in-the-format-string-should-skip-a-whi.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 12:12 PM, Tom Lane  wrote:
> amul sul  writes:
>> It's look like bug in to_timestamp() function when format string has more 
>> whitespaces compare to input string, see below:
>
> No, I think this is a case of "input doesn't match the format string".
>
> As a rule of thumb, using to_timestamp() for input that could be parsed
> just fine by the standard timestamp input function is not a particularly
> good idea.  to_timestamp() is meant to deal with input that is in a
> well-defined format that happens to not be parsable by timestamp_in.
> This example doesn't meet either of those preconditions.

I think a space in the format string should skip a whitespace
character in the input string, but not a non-whitespace character.
It's my understanding that these functions exist in no small part for
compatibility with Oracle, and Oracle declines to skip the digit '1'
on the basis of an extra space in the format string, which IMHO is the
behavior any reasonable user would expect.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in to_timestamp().

2016-06-13 Thread Tom Lane
amul sul  writes:
> It's look like bug in to_timestamp() function when format string has more 
> whitespaces compare to input string, see below: 

No, I think this is a case of "input doesn't match the format string".

As a rule of thumb, using to_timestamp() for input that could be parsed
just fine by the standard timestamp input function is not a particularly
good idea.  to_timestamp() is meant to deal with input that is in a
well-defined format that happens to not be parsable by timestamp_in.
This example doesn't meet either of those preconditions.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Bug in to_timestamp().

2016-06-13 Thread amul sul
Hi,

It's look like bug in to_timestamp() function when format string has more 
whitespaces compare to input string, see below: 

Ex.1: Two white spaces before HH24 whereas one before input time string

postgres=# SELECT TO_TIMESTAMP('2016-06-13 15:43:36', '/MM/DD  HH24:MI:SS');
to_timestamp 

2016-06-13 05:43:36-07   <— incorrect time 
(1 row)



Ex.2: One whitespace before  format string

postgres=# SELECT TO_TIMESTAMP('2016/06/13 15:43:36', ' /MM/DD HH24:MI:SS');
to_timestamp 
--
0016-06-13 15:43:36-07:52:58  <— incorrect year
(1 row)



If there are one or more consecutive whitespace in the format, we should skip 
those as long as we could get an actual field.
Thoughts?
Thanks & Regards,
Amul Sul


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers