[ 
https://issues.apache.org/jira/browse/IMPALA-9922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

fifteen updated IMPALA-9922:
----------------------------
    Description: 
When I tried to convert a `string` to a `timestamp` with the `to_timestamp()` 
function, I got some unexpected NULL values, so I am filing this issue to ask 
for your help. 

Currently, the built-in function `to_timestamp()` returns NULL when the input 
is shorter than the length that the format string describes. For example, the 
following query returns `NULL`:
{code:java}
[impala-shell]default> select to_timestamp("2020-01-01 18:00:00.12","yyyy-MM-dd HH:mm:ss.SSS")
> NULL{code}
whereas this query returns the converted value:
{code:java}
[impala-shell]default> select to_timestamp("2020-01-01 18:00:00.123","yyyy-MM-dd HH:mm:ss.SSS")
> 2020-01-01 18:00:00.123000000{code}
 

The snippet below shows the relevant logic. The file name is 
`cast-functions-ir.cc`.
{code:java}
bool ParseDateTime(const char* str, int str_len,
    const DateTimeFormatContext& dt_ctx, DateTimeParseResult* dt_result) {
  DCHECK(dt_ctx.fmt_len > 0);
  DCHECK(dt_ctx.toks.size() > 0);
  DCHECK(dt_result != NULL);
  //-------------------------------------------------------
  // If str_len < fmt_len, parsing fails and to_timestamp() returns NULL.
  //-------------------------------------------------------
  if (str_len <= 0 || str_len < dt_ctx.fmt_len || str == NULL) return false;
  StringParser::ParseResult status;
  ...
{code}
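To make the effect of that guard concrete, here is a minimal standalone sketch 
of the length comparison alone. `PassesLengthCheck` is a hypothetical helper, 
not an Impala function, and it assumes that `dt_ctx.fmt_len` equals the 
character length of the format string, which the snippet above does not 
confirm.

```cpp
#include <cstring>

// Hypothetical helper that mirrors only the length guard quoted above.
// Assumption: fmt_len is the character length of the format string.
bool PassesLengthCheck(const char* str, int str_len, int fmt_len) {
  if (str_len <= 0 || str_len < fmt_len || str == nullptr) return false;
  return true;
}

// Convenience overload for non-null, NUL-terminated strings (also hypothetical).
bool PassesLengthCheck(const char* str, const char* fmt) {
  return PassesLengthCheck(str, static_cast<int>(std::strlen(str)),
                           static_cast<int>(std::strlen(fmt)));
}
```

Under that assumption, "2020-01-01 18:00:00.12" (22 characters) fails against 
"yyyy-MM-dd HH:mm:ss.SSS" (23 characters), while "2020-01-01 18:00:00.123" 
(23 characters) passes.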
h3.  *My proposal* 

Would it be better if the function accepted the shorter input and returned the 
converted timestamp padded with zeros? That is, the following SQL would return 
2020-01-01 18:00:00.012000000:
{code:java}
[impala-shell]default> select to_timestamp("2020-01-01 18:00:00.12","yyyy-MM-dd HH:mm:ss.SSS")
{code}
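One way the fractional part could be handled is sketched below. This is only 
an illustration under stated assumptions, not Impala code: 
`ParseShortFraction` and its parameters are hypothetical names, and the 
sketch covers only the digits matched by a fractional-second token such as 
`SSS`, not the full `ParseDateTime` logic.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper, not part of Impala: converts the digits matched by a
// fractional-second token of `token_width` characters (3 for "SSS") into
// nanoseconds, accepting fewer digits than the token width.
// Shorter input is treated as if left-padded with zeros, so "12" under "SSS"
// is read as "012" (12 ms), matching the result proposed in this issue.
// Returns false for empty input, non-digit characters, more digits than the
// token width, or a width outside 1..9.
bool ParseShortFraction(const std::string& digits, int token_width,
                        int64_t* nanos) {
  if (nanos == nullptr || token_width < 1 || token_width > 9) return false;
  if (digits.empty() || static_cast<int>(digits.size()) > token_width) {
    return false;
  }
  int64_t value = 0;
  for (char c : digits) {
    if (c < '0' || c > '9') return false;
    value = value * 10 + (c - '0');
  }
  // Scale from units of 10^-token_width seconds to nanoseconds (10^-9).
  for (int i = token_width; i < 9; ++i) value *= 10;
  *nanos = value;
  return true;
}
```

With a `token_width` of 3, "12" yields 12,000,000 ns (the .012000000 in the 
example above) and "123" yields 123,000,000 ns. Right-padding instead, so 
that ".12" means 120 ms, would be a different design choice that this sketch 
does not implement.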
 

 

  was:
When I tried to convert `string` to `timestamp` with `to_timestamp()` function, 
I got some NULL values which are unexpected. So I am writing this issue to seek 
for your help. 

Currently, the built-in function `to_timestamp()` return NULL when input's 
length is not equal to the length of format string describes. For example, the 
following query returns `NULL`:
{code:java}
[impala-shell]default> select to_timestamp("2020-01-01 18:00:00.12","yyyy-MM-dd 
HH:mm:ss.SSS")
> NULL{code}
While this query returns converted value:
{code:java}
[impala-shell]default> select to_timestamp("2020-01-01 
18:00:00.123","yyyy-MM-dd HH:mm:ss.SSS") 
> 2020-01-01 18:00:00.123000000{code}
 

Snippet below explains the relative logic. The file name is 
`cast-functions-ir.cc` 
{code:java}
bool ParseDateTime(const char* str, int str_len, const DateTimeFormatContext& 
dt_ctx,
    DateTimeParseResult* dt_result) {
  DCHECK(dt_ctx.fmt_len > 0);
  DCHECK(dt_ctx.toks.size() > 0);
  DCHECK(dt_result != NULL);
  //-------------------------------------------------------
  // if str_len < fmt_len, Parse fail and return NULL 
 //-------------------------------------------------------
  if (str_len <= 0 || str_len < dt_ctx.fmt_len || str == NULL) return false; 
  StringParser::ParseResult status;
  ...
{code}
h3.  *My proposal* 

Will it be better if  function accepts the shorter input and returns converted 
timestamp with some padding zeros?  i.e. returns 2020-01-01 18:00:00.012000000 
with the following sql
{code:java}
[impala-shell]default> select to_timestamp("2020-01-01 18:00:00.12","yyyy-MM-dd 
HH:mm:ss.SSS")
{code}
 

 


> Is this a better approach to deal with malformed input in 'to_timestamp()'
> --------------------------------------------------------------------------
>
>                 Key: IMPALA-9922
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9922
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 3.4.0
>            Reporter: fifteen
>            Priority: Minor
>


