[GitHub] [spark] maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
URL: https://github.com/apache/spark/pull/25410#discussion_r321583353

File path: sql/core/src/test/resources/sql-tests/inputs/pgSQL/timestamp.sql

```
@@ -186,23 +186,23 @@ SELECT '' AS date_trunc_week, date_trunc( 'week', timestamp '2004-02-29 15:44:17
 -- FROM TIMESTAMP_TBL
 -- WHERE d1 BETWEEN timestamp '1902-01-01'
 --AND timestamp '2038-01-01';
-
--- [SPARK-28420] Date/Time Functions: date_part
--- SELECT '' AS "54", d1 as "timestamp",
---date_part( 'year', d1) AS year, date_part( 'month', d1) AS month,
---date_part( 'day', d1) AS day, date_part( 'hour', d1) AS hour,
---date_part( 'minute', d1) AS minute, date_part( 'second', d1) AS second
---FROM TIMESTAMP_TBL WHERE d1 BETWEEN '1902-01-01' AND '2038-01-01';
-
--- SELECT '' AS "54", d1 as "timestamp",
---date_part( 'quarter', d1) AS quarter, date_part( 'msec', d1) AS msec,
---date_part( 'usec', d1) AS usec
---FROM TIMESTAMP_TBL WHERE d1 BETWEEN '1902-01-01' AND '2038-01-01';
-
--- SELECT '' AS "54", d1 as "timestamp",
---date_part( 'isoyear', d1) AS isoyear, date_part( 'week', d1) AS week,
---date_part( 'dow', d1) AS dow
---FROM TIMESTAMP_TBL WHERE d1 BETWEEN '1902-01-01' AND '2038-01-01';
+-- [SPARK-28767] ParseException: no viable alternative at input 'year'
+set spark.sql.parser.ansi.enabled=false;
```

Review comment:
In pgSQL, `year` is not reserved, so we can use it as an alias name:
https://www.postgresql.org/docs/11/sql-keywords-appendix.html
Even if it's reserved, we can sometimes still use it, though:
```
postgres=# select 1 as year;
 year
------
    1
(1 row)

postgres=# create table year(t int);
CREATE TABLE

postgres=# select 1 as select;
 select
--------
      1
(1 row)

postgres=# create table select(t int);
2019-09-06 14:44:35.490 JST [6166] ERROR:  syntax error at or near "select" at character 14
2019-09-06 14:44:35.490 JST [6166] STATEMENT:  create table select(t int);
ERROR:  syntax error at or near "select"
LINE 1: create table select(t int);
```

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
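The same non-reserved/reserved split can be reproduced outside PostgreSQL. A hedged illustration using SQLite (which ships with Python, so it runs anywhere; SQLite's keyword list differs from PostgreSQL's, but `year` is likewise not a keyword there while `select` is reserved):

```python
import sqlite3

# `year` is not an SQLite keyword, so it is a legal bare table name;
# `select` is reserved, so the parser rejects it, just as in the psql
# session above.
con = sqlite3.connect(":memory:")
con.execute("create table year(t int)")        # fine: `year` is not reserved

try:
    con.execute("create table select(t int)")  # reserved word: syntax error
    rejected = False
except sqlite3.OperationalError:
    rejected = True
```

This is only a demonstration of the general keyword behaviour, not of Spark's parser.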
[GitHub] [spark] maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
URL: https://github.com/apache/spark/pull/25410#discussion_r321524590

File path: sql/core/src/test/resources/sql-tests/inputs/pgSQL/timestamp.sql

```
@@ -186,23 +186,23 @@ SELECT '' AS date_trunc_week, date_trunc( 'week', timestamp '2004-02-29 15:44:17
 ...
+-- [SPARK-28767] ParseException: no viable alternative at input 'year'
+set spark.sql.parser.ansi.enabled=false;
```

Review comment:
yea, quoting looks ok to me.
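The quoting workaround being approved here is easy to demonstrate. A hedged sketch using SQLite (not Spark): once a reserved word is quoted as an identifier it becomes a legal alias. SQLite and PostgreSQL use double quotes for this; Spark SQL uses backticks, as in the `` `54` ``/`` `timestamp` `` aliases in the diff under review.

```python
import sqlite3

# Bare `select` or `timestamp` would be rejected as aliases, but the quoted
# forms parse fine and name the result columns as expected.
con = sqlite3.connect(":memory:")
row = con.execute('select 1 as "select", 2 as "timestamp"').fetchone()
```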
[GitHub] [spark] maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
URL: https://github.com/apache/spark/pull/25410#discussion_r321274467

File path: sql/core/src/test/resources/sql-tests/inputs/pgSQL/timestamp.sql

```
@@ -186,23 +186,23 @@ SELECT '' AS date_trunc_week, date_trunc( 'week', timestamp '2004-02-29 15:44:17
 ...
+-- [SPARK-28767] ParseException: no viable alternative at input 'year'
+set spark.sql.parser.ansi.enabled=false;
```

Review comment:
To use `year` as an alias name in [the query](https://github.com/apache/spark/pull/25410/files#diff-e0d0d3afa458470e63a363b80e00e2c4R192) below, this just turns the ANSI mode off temporarily; `year` cannot be used as an alias name with ansi=true because it is a reserved keyword: https://github.com/apache/spark/pull/25410/files#r314599685
[GitHub] [spark] maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
URL: https://github.com/apache/spark/pull/25410#discussion_r316939605

File path: sql/core/src/test/resources/sql-tests/inputs/pgSQL/timestamp.sql

```
@@ -187,22 +187,21 @@ SELECT '' AS date_trunc_week, date_trunc( 'week', timestamp '2004-02-29 15:44:17
 -- WHERE d1 BETWEEN timestamp '1902-01-01'
 --AND timestamp '2038-01-01';
 ...
+SELECT '' AS `54`, d1 as `timestamp`,
```

Review comment:
Have you checked that? https://github.com/apache/spark/pull/25114#issuecomment-510537932
I think that's the expected behaviour in the ANSI mode.
[GitHub] [spark] maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
URL: https://github.com/apache/spark/pull/25410#discussion_r314962673

File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala

```
@@ -1409,48 +1409,7 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
    * Create a Extract expression.
    */
   override def visitExtract(ctx: ExtractContext): Expression = withOrigin(ctx) {
-    ctx.field.getText.toUpperCase(Locale.ROOT) match {
-      case "MILLENNIUM" | "MILLENNIA" | "MIL" | "MILS" =>
-        Millennium(expression(ctx.source))
-      case "CENTURY" | "CENTURIES" | "C" | "CENT" =>
-        Century(expression(ctx.source))
-      case "DECADE" | "DECADES" | "DEC" | "DECS" =>
-        Decade(expression(ctx.source))
-      case "YEAR" | "Y" | "YEARS" | "YR" | "YRS" =>
-        Year(expression(ctx.source))
-      case "ISOYEAR" =>
-        IsoYear(expression(ctx.source))
-      case "QUARTER" | "QTR" =>
-        Quarter(expression(ctx.source))
-      case "MONTH" | "MON" | "MONS" | "MONTHS" =>
-        Month(expression(ctx.source))
-      case "WEEK" | "W" | "WEEKS" =>
-        WeekOfYear(expression(ctx.source))
-      case "DAY" | "D" | "DAYS" =>
-        DayOfMonth(expression(ctx.source))
-      case "DAYOFWEEK" =>
-        DayOfWeek(expression(ctx.source))
-      case "DOW" =>
-        Subtract(DayOfWeek(expression(ctx.source)), Literal(1))
-      case "ISODOW" =>
-        Add(WeekDay(expression(ctx.source)), Literal(1))
-      case "DOY" =>
-        DayOfYear(expression(ctx.source))
-      case "HOUR" | "H" | "HOURS" | "HR" | "HRS" =>
-        Hour(expression(ctx.source))
-      case "MINUTE" | "M" | "MIN" | "MINS" | "MINUTES" =>
-        Minute(expression(ctx.source))
-      case "SECOND" | "S" | "SEC" | "SECONDS" | "SECS" =>
-        Second(expression(ctx.source))
-      case "MILLISECONDS" | "MSEC" | "MSECS" | "MILLISECON" | "MSECONDS" | "MS" =>
-        Milliseconds(expression(ctx.source))
-      case "MICROSECONDS" | "USEC" | "USECS" | "USECONDS" | "MICROSECON" | "US" =>
-        Microseconds(expression(ctx.source))
-      case "EPOCH" =>
-        Epoch(expression(ctx.source))
-      case other =>
-        throw new ParseException(s"Literals of type '$other' are currently not supported.", ctx)
```

Review comment:
Ur, I don't think the approach is bad; I'm just looking for a better solution that keeps the current error handling. If we already have similar logic, can we follow that?
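The removed `visitExtract` logic above is a big alias-to-extractor dispatch with an error case for unknown fields. A minimal Python sketch of that shape (not Spark's code; it reuses the PostgreSQL-style conventions visible in the diff, e.g. `dow` is `DayOfWeek - 1` with Sunday = 0, and `isodow` is ISO Monday = 1 .. Sunday = 7):

```python
from datetime import date

def date_part(field, d):
    # canonical field name -> extractor, mirroring the Scala match arms
    extractors = {
        "YEAR": lambda d: d.year,
        "MONTH": lambda d: d.month,
        "DAY": lambda d: d.day,
        "QUARTER": lambda d: (d.month - 1) // 3 + 1,
        # PostgreSQL-style: dow is 0 (Sunday) .. 6 (Saturday)
        "DOW": lambda d: (d.weekday() + 1) % 7,
        # isodow is 1 (Monday) .. 7 (Sunday)
        "ISODOW": lambda d: d.isoweekday(),
        "DOY": lambda d: d.timetuple().tm_yday,
    }
    # alias groups, e.g. YEAR | Y | YEARS | YR | YRS in the original match
    aliases = {
        "Y": "YEAR", "YEARS": "YEAR", "YR": "YEAR", "YRS": "YEAR",
        "QTR": "QUARTER",
        "MON": "MONTH", "MONS": "MONTH", "MONTHS": "MONTH",
        "D": "DAY", "DAYS": "DAY",
    }
    key = field.upper()
    key = aliases.get(key, key)
    if key not in extractors:
        # the `case other =>` arm under discussion
        raise ValueError(f"Literals of type '{field}' are currently not supported.")
    return extractors[key](d)
```

For example, `date_part('doy', date(2019, 8, 12))` yields 224, matching the `_FUNC_('doy', DATE'2019-08-12')` example elsewhere in this PR.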
[GitHub] [spark] maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
URL: https://github.com/apache/spark/pull/25410#discussion_r314657446

File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala

```
@@ -1409,48 +1409,7 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
 ...
-      case other =>
-        throw new ParseException(s"Literals of type '$other' are currently not supported.", ctx)
```

Review comment:
I've heard that it's a little hard to read parser error messages in Spark. So, I personally think we should throw a ParseException with ctx for the `extract` case as much as possible. Then, how about writing it like this? https://github.com/apache/spark/commit/d793f4958f3a9adc69f7a1e04f7fc7a994d5e15a
[GitHub] [spark] maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
URL: https://github.com/apache/spark/pull/25410#discussion_r314601085

File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala

```
@@ -1409,48 +1409,7 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
 ...
-      case other =>
-        throw new ParseException(s"Literals of type '$other' are currently not supported.", ctx)
```

Review comment:
Ur, I see! Sorry, but I need to be away from a keyboard, so I'll recheck in a few hours.
[GitHub] [spark] maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
URL: https://github.com/apache/spark/pull/25410#discussion_r314596786

File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala

```
@@ -1409,48 +1409,7 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
 ...
-      case other =>
-        throw new ParseException(s"Literals of type '$other' are currently not supported.", ctx)
```

Review comment:
Instead of passing ctx into DatePart, can't we resolve `extractField` in `AstBuilder` and then pass the resolved expression into DatePart?
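The design being proposed is: resolve the field name in the builder, at parse time, and hand the already-resolved extractor to the DatePart node, so an unknown field still fails with the parser's error handling. A hypothetical sketch (plain Python, not Spark; the names `visit_extract`, `RESOLVERS`, and the tuple-shaped nodes are illustration only):

```python
class ParseError(Exception):
    """Stands in for Spark's ParseException in this sketch."""

# field name -> constructor for the resolved child expression
RESOLVERS = {
    "YEAR": lambda src: ("Year", src),
    "MONTH": lambda src: ("Month", src),
    "DAY": lambda src: ("DayOfMonth", src),
}

def visit_extract(field, source):
    # Resolution happens here in the builder, so a bad field surfaces as a
    # parse error while we still have the parser context available.
    resolver = RESOLVERS.get(field.upper())
    if resolver is None:
        raise ParseError(f"Literals of type '{field}' are currently not supported.")
    # A DatePart-style node would wrap this already-resolved expression.
    return resolver(source)
```

The trade-off under discussion: resolving in the builder keeps ParseException semantics, whereas resolving inside the expression at analysis time would surface an AnalysisException instead.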
[GitHub] [spark] maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
URL: https://github.com/apache/spark/pull/25410#discussion_r314582126

File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala

```
@@ -1963,3 +1963,77 @@ case class Epoch(child: Expression, timeZoneId: Option[String] = None)
     defineCodeGen(ctx, ev, c => s"$dtu.getEpoch($c, $zid)")
   }
 }
+
+@ExpressionDescription(
+  usage = "_FUNC_(field, source) - Extracts a part of the date/timestamp.",
+  arguments = """
+    Arguments:
+      * field - selects which part of the source should be extracted. Supported string values are:
+          ["MILLENNIUM", "MILLENNIA", "MIL", "MILS",
+           "CENTURY", "CENTURIES", "C", "CENT",
+           "DECADE", "DECADES", "DEC", "DECS",
+           "YEAR", "Y", "YEARS", "YR", "YRS",
+           "ISOYEAR",
+           "QUARTER", "QTR",
+           "MONTH", "MON", "MONS", "MONTHS",
+           "WEEK", "W", "WEEKS",
+           "DAY", "D", "DAYS",
+           "DAYOFWEEK", "DOW", "ISODOW", "DOY",
+           "HOUR", "H", "HOURS", "HR", "HRS",
+           "MINUTE", "M", "MIN", "MINS", "MINUTES",
+           "SECOND", "S", "SEC", "SECONDS", "SECS",
+           "MILLISECONDS", "MSEC", "MSECS", "MILLISECON", "MSECONDS", "MS",
+           "MICROSECONDS", "USEC", "USECS", "USECONDS", "MICROSECON", "US",
+           "EPOCH"]
```

Review comment:
How about using parentheses for these alternatives, like this?
```
 * field - selects which part of the source should be extracted.
     Supported string values are:
       ["MILLENNIUM" ("MILLENNIA", "MIL", "MILS"),
        "CENTURY" ("CENTURIES", "C", "CENT"),
        "DECADE" ("DECADES", "DEC", "DECS"),
        "YEAR" ("Y", "YEARS", "YR", "YRS"),
        "ISOYEAR",
        "QUARTER" ("QTR"),
        "MONTH" ("MON", "MONS", "MONTHS"),
        "WEEK" ("W", "WEEKS"),
        "DAY" ("D", "DAYS"),
        "DAYOFWEEK" ("DOW", "ISODOW", "DOY"),
        "HOUR" ("H", "HOURS", "HR", "HRS"),
        "MINUTE" ("M", "MIN", "MINS", "MINUTES"),
        "SECOND" ("S", "SEC", "SECONDS", "SECS"),
        "MILLISECONDS" ("MSEC", "MSECS", "MILLISECON", "MSECONDS", "MS"),
        "MICROSECONDS" ("USEC", "USECS", "USECONDS", "MICROSECON", "US"),
        "EPOCH"]
```
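The suggested "canonical (aliases)" layout can be generated mechanically from the alias groups rather than maintained by hand. A small illustrative sketch (hypothetical helper, not part of the PR; only a few groups shown):

```python
# Each entry: (canonical field name, list of accepted aliases).
groups = [
    ("MILLENNIUM", ["MILLENNIA", "MIL", "MILS"]),
    ("YEAR", ["Y", "YEARS", "YR", "YRS"]),
    ("ISOYEAR", []),  # no aliases: rendered without parentheses
]

def render(groups):
    """Render alias groups in the parenthesized doc format, e.g.
    "YEAR" ("Y", "YEARS", "YR", "YRS")."""
    parts = []
    for canonical, aliases in groups:
        if aliases:
            quoted = ", ".join('"%s"' % a for a in aliases)
            parts.append('"%s" (%s)' % (canonical, quoted))
        else:
            parts.append('"%s"' % canonical)
    return "[" + ", ".join(parts) + "]"
```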
[GitHub] [spark] maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
URL: https://github.com/apache/spark/pull/25410#discussion_r314579663

File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala

```
@@ -1409,48 +1409,7 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
 ...
-      case other =>
-        throw new ParseException(s"Literals of type '$other' are currently not supported.", ctx)
```

Review comment:
The exception changed from `ParseException` to `AnalysisException`? Can we keep the current behaviour?
[GitHub] [spark] maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
URL: https://github.com/apache/spark/pull/25410#discussion_r314198550

File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala

```
@@ -1963,3 +1963,64 @@ case class Epoch(child: Expression, timeZoneId: Option[String] = None)
     defineCodeGen(ctx, ev, c => s"$dtu.getEpoch($c, $zid)")
   }
 }
+
+@ExpressionDescription(
+  usage = "_FUNC_(field, source) - Extracts a part of the date/timestamp.",
+  arguments = """
+    Arguments:
+      * field - selects which part of the source should be extracted. Supported string values are:
+          ["MILLENNIUM", "CENTURY", "DECADE", "YEAR", "QUARTER", "MONTH",
+           "WEEK", "DAY", "DAYOFWEEK", "DOW", "ISODOW", "DOY",
+           "HOUR", "MINUTE", "SECOND"]
+      * source - a date (or timestamp) column from where `field` should be extracted
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_('YEAR', TIMESTAMP '2019-08-12 01:00:00.123456');
+       2019
+      > SELECT _FUNC_('week', timestamp'2019-08-12 01:00:00.123456');
+       33
+      > SELECT _FUNC_('doy', DATE'2019-08-12');
+       224
+  """,
+  since = "3.0.0")
+case class DatePart(field: Expression, source: Expression, child: Expression)
+  extends RuntimeReplaceable {
+
+  def this(field: Expression, source: Expression) {
+    this(field, source, {
+      if (!field.foldable) {
+        throw new AnalysisException("The field parameter needs to be a foldable string value.")
```

Review comment:
I checked the MySQL behaviour and it doesn't support the case. So, it's ok to keep it as is.
[GitHub] [spark] maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
URL: https://github.com/apache/spark/pull/25410#discussion_r314139193

File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala

```scala
@@ -1963,3 +1963,64 @@ case class Epoch(child: Expression, timeZoneId: Option[String] = None)
     defineCodeGen(ctx, ev, c => s"$dtu.getEpoch($c, $zid)")
   }
 }
+
+@ExpressionDescription(
+  usage = "_FUNC_(field, source) - Extracts a part of the date/timestamp.",
+  arguments = """
+    Arguments:
+      * field - selects which part of the source should be extracted. Supported string values are:
+        ["MILLENNIUM", "CENTURY", "DECADE", "YEAR", "QUARTER", "MONTH",
+         "WEEK", "DAY", "DAYOFWEEK", "DOW", "ISODOW", "DOY",
+         "HOUR", "MINUTE", "SECOND"]
```

Review comment: Don't we need to list all the alternatives somewhere for users?
[GitHub] [spark] maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
maropu commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
URL: https://github.com/apache/spark/pull/25410#discussion_r314138145

File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala

```scala
@@ -1963,3 +1963,64 @@ case class Epoch(child: Expression, timeZoneId: Option[String] = None)
     defineCodeGen(ctx, ev, c => s"$dtu.getEpoch($c, $zid)")
   }
 }
+
+@ExpressionDescription(
+  usage = "_FUNC_(field, source) - Extracts a part of the date/timestamp.",
+  arguments = """
+    Arguments:
+      * field - selects which part of the source should be extracted. Supported string values are:
+        ["MILLENNIUM", "CENTURY", "DECADE", "YEAR", "QUARTER", "MONTH",
+         "WEEK", "DAY", "DAYOFWEEK", "DOW", "ISODOW", "DOY",
+         "HOUR", "MINUTE", "SECOND"]
+      * source - a date (or timestamp) column from where `field` should be extracted
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_('YEAR', TIMESTAMP '2019-08-12 01:00:00.123456');
+       2019
+      > SELECT _FUNC_('week', timestamp'2019-08-12 01:00:00.123456');
+       33
+      > SELECT _FUNC_('doy', DATE'2019-08-12');
+       224
+  """,
+  since = "3.0.0")
+case class DatePart(field: Expression, source: Expression, child: Expression)
+  extends RuntimeReplaceable {
+
+  def this(field: Expression, source: Expression) {
+    this(field, source, {
+      if (!field.foldable) {
+        throw new AnalysisException("The field parameter needs to be a foldable string value.")
```

Review comment: Does this follow the standard? It seems pgSQL accepts a non-foldable value:

```
postgres=# create table t (a text, b timestamp);
CREATE TABLE
postgres=# insert into t values ('year', '2019-08-12 01:00:00.123456');
INSERT 0 1
postgres=# select * from t;
  a   |             b
------+----------------------------
 year | 2019-08-12 01:00:00.123456
(1 row)

postgres=# select date_part(a, b) from t;
 date_part
-----------
      2019
(1 row)
```
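The psql session above shows why a non-foldable field is harder for Spark: the field name comes out of a column, so it must be interpreted per row at execution time instead of once during analysis. A minimal sketch of that per-row resolution (hypothetical helper, not Spark or PostgreSQL code, and covering only a few of the supported fields):

```scala
import java.time.LocalDateTime

// Hypothetical sketch of runtime (per-row) field resolution, as PostgreSQL's
// date_part does when the field argument is itself a column value.
object RuntimeDatePart {
  def datePart(field: String, ts: LocalDateTime): Int =
    field.toLowerCase match {
      case "year"  => ts.getYear        // matched per row, not at analysis time
      case "month" => ts.getMonthValue
      case "day"   => ts.getDayOfMonth
      case other   => sys.error(s"unsupported field: $other")
    }
}
```

Applied to the row inserted above, `RuntimeDatePart.datePart("year", LocalDateTime.parse("2019-08-12T01:00:00.123456"))` gives `2019`, matching the psql output.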