[ 
https://issues.apache.org/jira/browse/SPARK-18610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-18610:
---------------------------------
    Description: 
It seems Spark SQL fails to implicitly cast (or detect widen type) from string 
with date/timestamp.

{code}
spark-sql> select greatest("2015-02-02", date("2015-01-01")) ;
Error in query: cannot resolve 'greatest('2015-02-02', CAST('2015-01-01' AS 
DATE))' due to data type mismatch: The expressions should all have the same 
type, got GREATEST(string, date).; line 1 pos 7
{code}

It seems, at least, other DBMS support this by implicit casting/widened types.

{code}
hive> select greatest("2015-02-02", date("2015-01-01"));
OK
2015-02-02
hive> select greatest("2015-02-021", date("2015-01-01"));
OK
2015-01-01
Time taken: 0.019 seconds, Fetched: 1 row(s)
hive> select greatest("9999-02-021", date("2015-01-01"));
OK
2015-01-01
Time taken: 0.02 seconds, Fetched: 1 row(s)
hive>
hive> CREATE TEMPORARY TABLE typeof as select greatest("2015-02-02", 
date("2015-01-01"));
OK
Time taken: 2.63 seconds
hive> DESCRIBE typeof;
OK
_c0                     date
Time taken: 0.031 seconds, Fetched: 1 row(s)
{code}

{code}
mysql> select greatest("2015-02-02abc", date("2015-01-01"));
+-----------------------------------------------+
| greatest("2015-02-02abc", date("2015-01-01")) |
+-----------------------------------------------+
| 2015-02-02abc                                 |
+-----------------------------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> CREATE TEMPORARY TABLE typeof as select greatest("2015-02-02", 
date("2015-01-01"));
Query OK, 1 row affected (0.01 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> DESCRIBE typeof;
+--------------------------------------------+-------------+------+-----+---------+-------+
| Field                                      | Type        | Null | Key | 
Default | Extra |
+--------------------------------------------+-------------+------+-----+---------+-------+
| greatest("2015-02-02", date("2015-01-01")) | varchar(10) | YES  |     | NULL  
  |       |
+--------------------------------------------+-------------+------+-----+---------+-------+
1 row in set (0.00 sec)
{code}

{code}
postgres=# select greatest('2015-02-02abc', date('2015-01-01'));
ERROR:  invalid input syntax for type date: "2015-02-02abc"
LINE 1: select greatest('2015-02-02abc', date('2015-01-01'));

postgres=# CREATE TEMPORARY TABLE typeof as select greatest('2015-02-02', 
date('2015-01-01'));
SELECT 1

postgres=# \d+ typeof
                      Table "pg_temp_3.typeof"
  Column  | Type | Modifiers | Storage | Stats target | Description
----------+------+-----------+---------+--------------+-------------
 greatest | date |           | plain   |              |
Has OIDs: no
{code}

I tracked down and it seems we want Hive's behaviour assuming from SPARK-12201.

  was:
It seems Spark SQL fails to implicitly cast (or detect widen type) from string 
with date/timestamp.

{code}
spark-sql> select greatest("2015-02-02", date("2015-01-01")) ;
Error in query: cannot resolve 'greatest('2015-02-02', CAST('2015-01-01' AS 
DATE))' due to data type mismatch: The expressions should all have the same 
type, got GREATEST(string, date).; line 1 pos 7
{code}

It seems, at least, other DBMS support this by implicit casting/widened types.

{code}
hive> select greatest("2015-02-021", date("2015-01-01"));
OK
2015-01-01
Time taken: 0.019 seconds, Fetched: 1 row(s)
hive> select greatest("9999-02-021", date("2015-01-01"));
OK
2015-01-01
Time taken: 0.02 seconds, Fetched: 1 row(s)
hive>
hive> CREATE TEMPORARY TABLE typeof as select greatest("2015-02-02", 
date("2015-01-01"));
OK
Time taken: 2.63 seconds
hive> DESCRIBE typeof;
OK
_c0                     date
Time taken: 0.031 seconds, Fetched: 1 row(s)
{code}

{code}
mysql> select greatest("2015-02-02abc", date("2015-01-01"));
+-----------------------------------------------+
| greatest("2015-02-02abc", date("2015-01-01")) |
+-----------------------------------------------+
| 2015-02-02abc                                 |
+-----------------------------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> CREATE TEMPORARY TABLE typeof as select greatest("2015-02-02", 
date("2015-01-01"));
Query OK, 1 row affected (0.01 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> DESCRIBE typeof;
+--------------------------------------------+-------------+------+-----+---------+-------+
| Field                                      | Type        | Null | Key | 
Default | Extra |
+--------------------------------------------+-------------+------+-----+---------+-------+
| greatest("2015-02-02", date("2015-01-01")) | varchar(10) | YES  |     | NULL  
  |       |
+--------------------------------------------+-------------+------+-----+---------+-------+
1 row in set (0.00 sec)
{code}

{code}
postgres=# select greatest('2015-02-02abc', date('2015-01-01'));
ERROR:  invalid input syntax for type date: "2015-02-02abc"
LINE 1: select greatest('2015-02-02abc', date('2015-01-01'));

postgres=# CREATE TEMPORARY TABLE typeof as select greatest('2015-02-02', 
date('2015-01-01'));
SELECT 1

postgres=# \d+ typeof
                      Table "pg_temp_3.typeof"
  Column  | Type | Modifiers | Storage | Stats target | Description
----------+------+-----------+---------+--------------+-------------
 greatest | date |           | plain   |              |
Has OIDs: no
{code}

I tracked down and it seems we want Hive's behaviour assuming from SPARK-12201.


> greatest/leatest fails to run with string aginst date/timestamp
> ---------------------------------------------------------------
>
>                 Key: SPARK-18610
>                 URL: https://issues.apache.org/jira/browse/SPARK-18610
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Hyukjin Kwon
>
> It seems Spark SQL fails to implicitly cast (or detect widen type) from 
> string with date/timestamp.
> {code}
> spark-sql> select greatest("2015-02-02", date("2015-01-01")) ;
> Error in query: cannot resolve 'greatest('2015-02-02', CAST('2015-01-01' AS 
> DATE))' due to data type mismatch: The expressions should all have the same 
> type, got GREATEST(string, date).; line 1 pos 7
> {code}
> It seems, at least, other DBMS support this by implicit casting/widened types.
> {code}
> hive> select greatest("2015-02-02", date("2015-01-01"));
> OK
> 2015-02-02
> hive> select greatest("2015-02-021", date("2015-01-01"));
> OK
> 2015-01-01
> Time taken: 0.019 seconds, Fetched: 1 row(s)
> hive> select greatest("9999-02-021", date("2015-01-01"));
> OK
> 2015-01-01
> Time taken: 0.02 seconds, Fetched: 1 row(s)
> hive>
> hive> CREATE TEMPORARY TABLE typeof as select greatest("2015-02-02", 
> date("2015-01-01"));
> OK
> Time taken: 2.63 seconds
> hive> DESCRIBE typeof;
> OK
> _c0                   date
> Time taken: 0.031 seconds, Fetched: 1 row(s)
> {code}
> {code}
> mysql> select greatest("2015-02-02abc", date("2015-01-01"));
> +-----------------------------------------------+
> | greatest("2015-02-02abc", date("2015-01-01")) |
> +-----------------------------------------------+
> | 2015-02-02abc                                 |
> +-----------------------------------------------+
> 1 row in set, 1 warning (0.00 sec)
> mysql> CREATE TEMPORARY TABLE typeof as select greatest("2015-02-02", 
> date("2015-01-01"));
> Query OK, 1 row affected (0.01 sec)
> Records: 1  Duplicates: 0  Warnings: 0
> mysql> DESCRIBE typeof;
> +--------------------------------------------+-------------+------+-----+---------+-------+
> | Field                                      | Type        | Null | Key | 
> Default | Extra |
> +--------------------------------------------+-------------+------+-----+---------+-------+
> | greatest("2015-02-02", date("2015-01-01")) | varchar(10) | YES  |     | 
> NULL    |       |
> +--------------------------------------------+-------------+------+-----+---------+-------+
> 1 row in set (0.00 sec)
> {code}
> {code}
> postgres=# select greatest('2015-02-02abc', date('2015-01-01'));
> ERROR:  invalid input syntax for type date: "2015-02-02abc"
> LINE 1: select greatest('2015-02-02abc', date('2015-01-01'));
> postgres=# CREATE TEMPORARY TABLE typeof as select greatest('2015-02-02', 
> date('2015-01-01'));
> SELECT 1
> postgres=# \d+ typeof
>                       Table "pg_temp_3.typeof"
>   Column  | Type | Modifiers | Storage | Stats target | Description
> ----------+------+-----------+---------+--------------+-------------
>  greatest | date |           | plain   |              |
> Has OIDs: no
> {code}
> I tracked down and it seems we want Hive's behaviour assuming from 
> SPARK-12201.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to