[jira] [Comment Edited] (SPARK-19496) to_date with format has weird behavior
[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857541#comment-15857541 ] Song Jun edited comment on SPARK-19496 at 2/8/17 7:18 AM: -- mysql: select str_to_date('2014-12-31','%Y-%d-%m') also return null that is mysql both return null when the date is invalidate or the formate is invalidate. and hive will transform the invalidate date to valid, e.g 2014-31-12 -> 31/12 = 2 -> 2014+2=2016 , 31 - 12*2=7 -> 2016-07-12 currently spark can handle wrong format / wrong date when to_date has the format parameter (like hive's transform), what about we also make to_date without format parameter follow its action, that is replace null with a transformed date to return was (Author: windpiger): mysql: select str_to_date('2014-12-31','%Y-%d-%m') also return null that is mysql both return null when the date is invalidate or the formate is invalidate. and hive will transform the invalidate date to valid, e.g 2014-31-12 -> 31/12 = 2 -> 2014+2=2016 , 31 - 12*2=7 -> 2016-07-12 > to_date with format has weird behavior > -- > > Key: SPARK-19496 > URL: https://issues.apache.org/jira/browse/SPARK-19496 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >Reporter: Wenchen Fan > > Today, if we run > {code} > SELECT to_date('2015-07-22', '-dd-MM') > {code} > will result to `2016-10-07`, while running > {code} > SELECT to_date('2014-31-12') # default format > {code} > will return null. > this behavior is weird and we should check other systems like hive to see if > this is expected. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-19496) to_date with format has weird behavior
[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857541#comment-15857541 ] Song Jun edited comment on SPARK-19496 at 2/8/17 7:11 AM: -- mysql: select str_to_date('2014-12-31','%Y-%d-%m') also return null that is mysql both return null when the date is invalidate or the formate is invalidate. and hive will transform the invalidate date to valid, e.g 2014-31-12 -> 31/12 = 2 -> 2014+2=2016 , 31 - 12*2=7 -> 2016-07-12 was (Author: windpiger): mysql: select str_to_date('2014-12-31','%Y-%d-%m') also return null mysql both return null when the date is invalidate or the formate is invalidate > to_date with format has weird behavior > -- > > Key: SPARK-19496 > URL: https://issues.apache.org/jira/browse/SPARK-19496 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >Reporter: Wenchen Fan > > Today, if we run > {code} > SELECT to_date('2015-07-22', '-dd-MM') > {code} > will result to `2016-10-07`, while running > {code} > SELECT to_date('2014-31-12') # default format > {code} > will return null. > this behavior is weird and we should check other systems like hive to see if > this is expected. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-19496) to_date with format has weird behavior
[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857541#comment-15857541 ] Song Jun edited comment on SPARK-19496 at 2/8/17 7:09 AM: -- mysql: select str_to_date('2014-12-31','%Y-%d-%m') also return null mysql both return null when the date is invalidate or the formate is invalidate was (Author: windpiger): mysql: select str_to_date('2014-12-31','%Y-%d-%m') also return null > to_date with format has weird behavior > -- > > Key: SPARK-19496 > URL: https://issues.apache.org/jira/browse/SPARK-19496 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >Reporter: Wenchen Fan > > Today, if we run > {code} > SELECT to_date('2015-07-22', '-dd-MM') > {code} > will result to `2016-10-07`, while running > {code} > SELECT to_date('2014-31-12') # default format > {code} > will return null. > this behavior is weird and we should check other systems like hive to see if > this is expected. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-19496) to_date with format has weird behavior
[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857459#comment-15857459 ] Hyukjin Kwon edited comment on SPARK-19496 at 2/8/17 6:16 AM: -- - Hive {code} hive> SELECT to_date('2014-31-12'); 2016-07-12 ... hive> SELECT to_date('2014-12-32'); 2015-01-01 ... hive> SELECT to_date('2014-12-31'); 2014-12-31 {code} {code} hive> SELECT to_date('2015-07-22', '-dd-MM') org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:7 Arguments length mismatch ''-dd-MM'': to_date() requires 1 argument, got 2 ... {code} - Postgres {code} postgres=# SELECT to_date('2014-12-31'); ERROR: function to_date(unknown) does not exist LINE 1: SELECT to_date('2014-12-31'); ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. {code} {code} postgres=# SELECT to_date('2014-12-31', '-MM-dd'); to_date 2014-12-31 (1 row) postgres=# SELECT to_date('2014-13-31', '-MM-dd'); to_date 2015-01-31 (1 row) {code} - Spark {code} spark-sql> SELECT to_date('2014-31-12'); NULL ... spark-sql> SELECT to_date('2014-12-32'); NULL ... spark-sql> SELECT to_date('2014-12-31'); 2014-12-31 ... {code} {code} spark-sql> SELECT to_date('2015-07-22', '-dd-MM') 2016-10-07 ... {code} - MySQL {code} mysql> SELECT str_to_date('2014-12-31'); ERROR 1582 (42000): Incorrect parameter count in the call to native function 'str_to_date' {code} {code} mysql> SELECT str_to_date('2014-12-31', '%Y-%m-%d'); +---+ | str_to_date('2014-12-31', '%Y-%m-%d') | +---+ | 2014-12-31| +---+ 1 row in set (0.00 sec) mysql> SELECT str_to_date('2014-13-31', '%Y-%m-%d'); +---+ | str_to_date('2014-13-31', '%Y-%m-%d') | +---+ | NULL | +---+ 1 row in set, 1 warning (0.00 sec) {code} MySQL/SparkSQL - it seems returning {{NULL}}. MySQL seems not supporting omitting the format. Hive/Postgres - it seems returning calculated dates. Hive seems not supporting the format. Postgres seems not supporting omitting the format. was (Author: hyukjin.kwon): - Hive {code} hive> SELECT to_date('2014-31-12'); 2016-07-12 ... hive> SELECT to_date('2014-12-32'); 2015-01-01 ... hive> SELECT to_date('2014-12-31'); 2014-12-31 {code} {code} hive> SELECT to_date('2015-07-22', '-dd-MM') org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:7 Arguments length mismatch ''-dd-MM'': to_date() requires 1 argument, got 2 ... {code} - Postgres {code} postgres=# SELECT to_date('2014-12-31'); ERROR: function to_date(unknown) does not exist LINE 1: SELECT to_date('2014-12-31'); ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. {code} {code} postgres=# SELECT to_date('2014-12-31'); ERROR: function to_date(unknown) does not exist LINE 1: SELECT to_date('2014-12-31'); ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. postgres=# SELECT to_date('2014-12-31', '-MM-dd'); to_date 2014-12-31 (1 row) postgres=# SELECT to_date('2014-13-31', '-MM-dd'); to_date 2015-01-31 (1 row) {code} - Spark {code} spark-sql> SELECT to_date('2014-31-12'); NULL ... spark-sql> SELECT to_date('2014-12-32'); NULL ... spark-sql> SELECT to_date('2014-12-31'); 2014-12-31 ... {code} {code} spark-sql> SELECT to_date('2015-07-22', '-dd-MM') 2016-10-07 ... {code} - MySQL {code} mysql> SELECT str_to_date('2014-12-31'); ERROR 1582 (42000): Incorrect parameter count in the call to native function 'str_to_date' {code} {code} mysql> SELECT str_to_date('2014-12-31', '%Y-%m-%d'); +---+ | str_to_date('2014-12-31', '%Y-%m-%d') | +---+ | 2014-12-31| +---+ 1 row in set (0.00 sec) mysql> SELECT str_to_date('2014-13-31', '%Y-%m-%d'); +---+ | str_to_date('2014-13-31', '%Y-%m-%d') | +---+ | NULL | +---+ 1 row in set, 1 warning (0.00 sec) {code} MySQL/SparkSQL - it seems returning {{NULL}}. MySQL seems not supporting omitting the format. Hive/Postgres - it seems returning calculated dates. Hive seems not supporting the format. Postgres seems not supporting omitting the format. > to_date with format has weird behavior > -- > > Key: SPARK-19496 > URL:
[jira] [Comment Edited] (SPARK-19496) to_date with format has weird behavior
[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857459#comment-15857459 ] Hyukjin Kwon edited comment on SPARK-19496 at 2/8/17 6:16 AM: -- - Hive {code} hive> SELECT to_date('2014-31-12'); 2016-07-12 ... hive> SELECT to_date('2014-12-32'); 2015-01-01 ... hive> SELECT to_date('2014-12-31'); 2014-12-31 {code} {code} hive> SELECT to_date('2015-07-22', '-dd-MM') org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:7 Arguments length mismatch ''-dd-MM'': to_date() requires 1 argument, got 2 ... {code} - Postgres {code} postgres=# SELECT to_date('2014-12-31'); ERROR: function to_date(unknown) does not exist LINE 1: SELECT to_date('2014-12-31'); ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. {code} {code} postgres=# SELECT to_date('2014-12-31'); ERROR: function to_date(unknown) does not exist LINE 1: SELECT to_date('2014-12-31'); ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. postgres=# SELECT to_date('2014-12-31', '-MM-dd'); to_date 2014-12-31 (1 row) postgres=# SELECT to_date('2014-13-31', '-MM-dd'); to_date 2015-01-31 (1 row) {code} - Spark {code} spark-sql> SELECT to_date('2014-31-12'); NULL ... spark-sql> SELECT to_date('2014-12-32'); NULL ... spark-sql> SELECT to_date('2014-12-31'); 2014-12-31 ... {code} {code} spark-sql> SELECT to_date('2015-07-22', '-dd-MM') 2016-10-07 ... {code} - MySQL {code} mysql> SELECT str_to_date('2014-12-31'); ERROR 1582 (42000): Incorrect parameter count in the call to native function 'str_to_date' {code} {code} mysql> SELECT str_to_date('2014-12-31', '%Y-%m-%d'); +---+ | str_to_date('2014-12-31', '%Y-%m-%d') | +---+ | 2014-12-31| +---+ 1 row in set (0.00 sec) mysql> SELECT str_to_date('2014-13-31', '%Y-%m-%d'); +---+ | str_to_date('2014-13-31', '%Y-%m-%d') | +---+ | NULL | +---+ 1 row in set, 1 warning (0.00 sec) {code} MySQL/SparkSQL - it seems returning {{NULL}}. MySQL seems not supporting omitting the format. Hive/Postgres - it seems returning calculated dates. Hive seems not supporting the format. Postgres seems not supporting omitting the format. was (Author: hyukjin.kwon): - Hive {code} hive> SELECT to_date('2014-31-12'); 2016-07-12 ... hive> SELECT to_date('2014-12-32'); 2015-01-01 ... hive> SELECT to_date('2014-12-31'); 2014-12-31 {code} {code} hive> SELECT to_date('2015-07-22', '-dd-MM') org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:7 Arguments length mismatch ''-dd-MM'': to_date() requires 1 argument, got 2 ... {code} - Postgres {code} postgres=# SELECT to_date('2014-12-31'); ERROR: function to_date(unknown) does not exist LINE 1: SELECT to_date('2014-12-31'); ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. postgres=# SELECT to_date('2014-12-31', '-MM-dd'); to_date 2014-12-31 (1 row) postgres=# SELECT to_date('2014-13-31', '-MM-dd'); to_date 2015-01-31 (1 row) {code} - Spark {code} spark-sql> SELECT to_date('2014-31-12'); NULL ... spark-sql> SELECT to_date('2014-12-32'); NULL ... spark-sql> SELECT to_date('2014-12-31'); 2014-12-31 ... {code} {code} spark-sql> SELECT to_date('2015-07-22', '-dd-MM') 2016-10-07 ... {code} - MySQL {code} mysql> SELECT str_to_date('2014-12-31', '%Y-%m-%d'); +---+ | str_to_date('2014-12-31', '%Y-%m-%d') | +---+ | 2014-12-31| +---+ 1 row in set (0.00 sec) mysql> SELECT str_to_date('2014-13-31', '%Y-%m-%d'); +---+ | str_to_date('2014-13-31', '%Y-%m-%d') | +---+ | NULL | +---+ 1 row in set, 1 warning (0.00 sec) {code} MySQL/SparkSQL - it seems returning {{NULL}}. MySQL seems not supporting omitting the format. Hive/Postgres - it seems returning calculated dates. Hive seems not supporting the format. Postgres seems not supporting omitting the format. > to_date with format has weird behavior > -- > > Key: SPARK-19496 > URL: https://issues.apache.org/jira/browse/SPARK-19496 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >
[jira] [Comment Edited] (SPARK-19496) to_date with format has weird behavior
[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857459#comment-15857459 ] Hyukjin Kwon edited comment on SPARK-19496 at 2/8/17 6:12 AM: -- - Hive {code} hive> SELECT to_date('2014-31-12'); 2016-07-12 ... hive> SELECT to_date('2014-12-32'); 2015-01-01 ... hive> SELECT to_date('2014-12-31'); 2014-12-31 {code} {code} hive> SELECT to_date('2015-07-22', '-dd-MM') org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:7 Arguments length mismatch ''-dd-MM'': to_date() requires 1 argument, got 2 ... {code} - Postgres {code} postgres=# SELECT to_date('2014-12-31'); ERROR: function to_date(unknown) does not exist LINE 1: SELECT to_date('2014-12-31'); ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. postgres=# SELECT to_date('2014-12-31', '-MM-dd'); to_date 2014-12-31 (1 row) postgres=# SELECT to_date('2014-13-31', '-MM-dd'); to_date 2015-01-31 (1 row) {code} - Spark {code} spark-sql> SELECT to_date('2014-31-12'); NULL ... spark-sql> SELECT to_date('2014-12-32'); NULL ... spark-sql> SELECT to_date('2014-12-31'); 2014-12-31 ... {code} {code} spark-sql> SELECT to_date('2015-07-22', '-dd-MM') 2016-10-07 ... {code} - MySQL {code} mysql> SELECT str_to_date('2014-12-31', '%Y-%m-%d'); +---+ | str_to_date('2014-12-31', '%Y-%m-%d') | +---+ | 2014-12-31| +---+ 1 row in set (0.00 sec) mysql> SELECT str_to_date('2014-13-31', '%Y-%m-%d'); +---+ | str_to_date('2014-13-31', '%Y-%m-%d') | +---+ | NULL | +---+ 1 row in set, 1 warning (0.00 sec) {code} MySQL/SparkSQL - it seems returning {{NULL}}. MySQL seems not supporting omitting the format. Hive/Postgres - it seems returning calculated dates. Hive seems not supporting the format. Postgres seems not supporting omitting the format. was (Author: hyukjin.kwon): - Hive {code} hive> SELECT to_date('2014-31-12'); 2016-07-12 ... hive> SELECT to_date('2014-12-32'); 2015-01-01 ... hive> SELECT to_date('2014-12-31'); 2014-12-31 {code} - Postgres {code} postgres=# SELECT to_date('2014-12-31'); ERROR: function to_date(unknown) does not exist LINE 1: SELECT to_date('2014-12-31'); ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. postgres=# SELECT to_date('2014-12-31', '-MM-dd'); to_date 2014-12-31 (1 row) postgres=# SELECT to_date('2014-13-31', '-MM-dd'); to_date 2015-01-31 (1 row) {code} - Spark {code} spark-sql> SELECT to_date('2014-31-12'); NULL ... spark-sql> SELECT to_date('2014-12-32'); NULL ... spark-sql> SELECT to_date('2014-12-31'); 2014-12-31 ... {code} - MySQL {code} mysql> SELECT str_to_date('2014-12-31', '%Y-%m-%d'); +---+ | str_to_date('2014-12-31', '%Y-%m-%d') | +---+ | 2014-12-31| +---+ 1 row in set (0.00 sec) mysql> SELECT str_to_date('2014-13-31', '%Y-%m-%d'); +---+ | str_to_date('2014-13-31', '%Y-%m-%d') | +---+ | NULL | +---+ 1 row in set, 1 warning (0.00 sec) {code} MySQL/SparkSQL - it seems returning {{NULL}}. Hive/Postgres - it seems returning calculated dates. > to_date with format has weird behavior > -- > > Key: SPARK-19496 > URL: https://issues.apache.org/jira/browse/SPARK-19496 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >Reporter: Wenchen Fan > > Today, if we run > {code} > SELECT to_date('2015-07-22', '-dd-MM') > {code} > will result to `2016-10-07`, while running > {code} > SELECT to_date('2014-31-12') # default format > {code} > will return null. > this behavior is weird and we should check other systems like hive to see if > this is expected. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org