[
https://issues.apache.org/jira/browse/SPARK-28288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16884259#comment-16884259
]
YoungGyu Chun edited comment on SPARK-28288 at 7/13/19 1:21 AM:
----------------------------------------------------------------
Hello [~hyukjin.kwon],
The following is the result of {{git diff}}. As you can see, there are some
"cannot resolve" errors. Do we need to file a JIRA, or is there something
wrong with it?
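For what it's worth, the "cannot resolve" errors in the diff below look expected: the same error text appears on both the {{-}} and {{+}} lines, and the only difference is the reported position ({{pos 33}} vs {{pos 38}}), because wrapping a column as {{udf(col)}} adds five characters before the OVER clause. A quick sketch (the query string here is illustrative, not the exact test query):

```python
# The only change in the AnalysisException messages is the column offset:
# "udf(" + ")" around a column adds 5 characters before the window spec.
original = "SELECT val, cate, count(val) OVER(PARTITION BY cate"
wrapped = "SELECT udf(val), cate, count(val) OVER(PARTITION BY cate"

# Everything after the wrapped column shifts right by 5, e.g. pos 33 -> pos 38.
shift = wrapped.index("OVER") - original.index("OVER")
print(shift)  # 5
```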
{code:sql}
diff --git a/sql/core/src/test/resources/sql-tests/results/window.sql.out
b/sql/core/src/test/resources/sql-tests/results/udf/udf-window.sql.out
index 367dc4f513..43093bd05b 100644
--- a/sql/core/src/test/resources/sql-tests/results/window.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/udf/udf-window.sql.out
@@ -21,74 +21,74 @@ struct<>
-- !query 1
-SELECT val, cate, count(val) OVER(PARTITION BY cate ORDER BY val ROWS CURRENT
ROW) FROM testData
+SELECT udf(val), cate, count(val) OVER(PARTITION BY cate ORDER BY val ROWS
CURRENT ROW) FROM testData
ORDER BY cate, val
-- !query 1 schema
-struct<val:int,cate:string,count(val) OVER (PARTITION BY cate ORDER BY val ASC
NULLS FIRST ROWS BETWEEN CURRENT ROW AND CURRENT ROW):bigint>
+struct<udf(val):string,cate:string,count(val) OVER (PARTITION BY cate ORDER BY
val ASC NULLS FIRST ROWS BETWEEN CURRENT ROW AND CURRENT ROW):bigint>
-- !query 1 output
-NULL NULL 0
-3 NULL 1
-NULL a 0
-1 a 1
-1 a 1
-2 a 1
-1 b 1
-2 b 1
-3 b 1
+nan NULL 0
+3.0 NULL 1
+nan a 0
+1.0 a 1
+1.0 a 1
+2.0 a 1
+1.0 b 1
+2.0 b 1
+3.0 b 1
-- !query 2
-SELECT val, cate, sum(val) OVER(PARTITION BY cate ORDER BY val
+SELECT udf(val), cate, sum(val) OVER(PARTITION BY cate ORDER BY val
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 FOLLOWING) FROM testData ORDER BY cate,
val
-- !query 2 schema
-struct<val:int,cate:string,sum(val) OVER (PARTITION BY cate ORDER BY val ASC
NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND 1 FOLLOWING):bigint>
+struct<udf(val):string,cate:string,sum(val) OVER (PARTITION BY cate ORDER BY
val ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND 1 FOLLOWING):bigint>
-- !query 2 output
-NULL NULL 3
-3 NULL 3
-NULL a 1
-1 a 2
-1 a 4
-2 a 4
-1 b 3
-2 b 6
-3 b 6
+nan NULL 3
+3.0 NULL 3
+nan a 1
+1.0 a 2
+1.0 a 4
+2.0 a 4
+1.0 b 3
+2.0 b 6
+3.0 b 6
-- !query 3
-SELECT val_long, cate, sum(val_long) OVER(PARTITION BY cate ORDER BY val_long
+SELECT val_long, cate, udf(sum(val_long)) OVER(PARTITION BY cate ORDER BY
val_long
ROWS BETWEEN CURRENT ROW AND 2147483648 FOLLOWING) FROM testData ORDER BY
cate, val_long
-- !query 3 schema
struct<>
-- !query 3 output
org.apache.spark.sql.AnalysisException
-cannot resolve 'ROWS BETWEEN CURRENT ROW AND 2147483648L FOLLOWING' due to
data type mismatch: The data type of the upper bound 'bigint' does not match
the expected data type 'int'.; line 1 pos 41
+cannot resolve 'ROWS BETWEEN CURRENT ROW AND 2147483648L FOLLOWING' due to
data type mismatch: The data type of the upper bound 'bigint' does not match
the expected data type 'int'.; line 1 pos 46
-- !query 4
-SELECT val, cate, count(val) OVER(PARTITION BY cate ORDER BY val RANGE 1
PRECEDING) FROM testData
+SELECT udf(val), cate, count(val) OVER(PARTITION BY cate ORDER BY val RANGE 1
PRECEDING) FROM testData
ORDER BY cate, val
-- !query 4 schema
-struct<val:int,cate:string,count(val) OVER (PARTITION BY cate ORDER BY val ASC
NULLS FIRST RANGE BETWEEN 1 PRECEDING AND CURRENT ROW):bigint>
+struct<udf(val):string,cate:string,count(val) OVER (PARTITION BY cate ORDER BY
val ASC NULLS FIRST RANGE BETWEEN 1 PRECEDING AND CURRENT ROW):bigint>
-- !query 4 output
-NULL NULL 0
-3 NULL 1
-NULL a 0
-1 a 2
-1 a 2
-2 a 3
-1 b 1
-2 b 2
-3 b 2
+nan NULL 0
+3.0 NULL 1
+nan a 0
+1.0 a 2
+1.0 a 2
+2.0 a 3
+1.0 b 1
+2.0 b 2
+3.0 b 2
-- !query 5
-SELECT val, cate, sum(val) OVER(PARTITION BY cate ORDER BY val
+SELECT val, udf(cate), sum(val) OVER(PARTITION BY cate ORDER BY val
RANGE BETWEEN CURRENT ROW AND 1 FOLLOWING) FROM testData ORDER BY cate, val
-- !query 5 schema
-struct<val:int,cate:string,sum(val) OVER (PARTITION BY cate ORDER BY val ASC
NULLS FIRST RANGE BETWEEN CURRENT ROW AND 1 FOLLOWING):bigint>
+struct<val:int,udf(cate):string,sum(val) OVER (PARTITION BY cate ORDER BY val
ASC NULLS FIRST RANGE BETWEEN CURRENT ROW AND 1 FOLLOWING):bigint>
-- !query 5 output
-NULL NULL NULL
-3 NULL 3
+NULL None NULL
+3 None 3
NULL a NULL
1 a 4
1 a 4
@@ -99,13 +99,13 @@ NULL a NULL
-- !query 6
-SELECT val_long, cate, sum(val_long) OVER(PARTITION BY cate ORDER BY val_long
+SELECT val_long, udf(cate), sum(val_long) OVER(PARTITION BY cate ORDER BY
val_long
RANGE BETWEEN CURRENT ROW AND 2147483648 FOLLOWING) FROM testData ORDER BY
cate, val_long
-- !query 6 schema
-struct<val_long:bigint,cate:string,sum(val_long) OVER (PARTITION BY cate ORDER
BY val_long ASC NULLS FIRST RANGE BETWEEN CURRENT ROW AND 2147483648
FOLLOWING):bigint>
+struct<val_long:bigint,udf(cate):string,sum(val_long) OVER (PARTITION BY cate
ORDER BY val_long ASC NULLS FIRST RANGE BETWEEN CURRENT ROW AND 2147483648
FOLLOWING):bigint>
-- !query 6 output
-NULL NULL NULL
-1 NULL 1
+NULL None NULL
+1 None 1
1 a 4
1 a 4
2 a 2147483652
@@ -116,13 +116,13 @@ NULL b NULL
-- !query 7
-SELECT val_double, cate, sum(val_double) OVER(PARTITION BY cate ORDER BY
val_double
+SELECT val_double, udf(cate), sum(val_double) OVER(PARTITION BY cate ORDER BY
val_double
RANGE BETWEEN CURRENT ROW AND 2.5 FOLLOWING) FROM testData ORDER BY cate,
val_double
-- !query 7 schema
-struct<val_double:double,cate:string,sum(val_double) OVER (PARTITION BY cate
ORDER BY val_double ASC NULLS FIRST RANGE BETWEEN CURRENT ROW AND CAST(2.5 AS
DOUBLE) FOLLOWING):double>
+struct<val_double:double,udf(cate):string,sum(val_double) OVER (PARTITION BY
cate ORDER BY val_double ASC NULLS FIRST RANGE BETWEEN CURRENT ROW AND CAST(2.5
AS DOUBLE) FOLLOWING):double>
-- !query 7 output
-NULL NULL NULL
-1.0 NULL 1.0
+NULL None NULL
+1.0 None 1.0
1.0 a 4.5
1.0 a 4.5
2.5 a 2.5
@@ -133,13 +133,13 @@ NULL NULL NULL
-- !query 8
-SELECT val_date, cate, max(val_date) OVER(PARTITION BY cate ORDER BY val_date
+SELECT val_date, udf(cate), max(val_date) OVER(PARTITION BY cate ORDER BY
val_date
RANGE BETWEEN CURRENT ROW AND 2 FOLLOWING) FROM testData ORDER BY cate,
val_date
-- !query 8 schema
-struct<val_date:date,cate:string,max(val_date) OVER (PARTITION BY cate ORDER
BY val_date ASC NULLS FIRST RANGE BETWEEN CURRENT ROW AND 2 FOLLOWING):date>
+struct<val_date:date,udf(cate):string,max(val_date) OVER (PARTITION BY cate
ORDER BY val_date ASC NULLS FIRST RANGE BETWEEN CURRENT ROW AND 2
FOLLOWING):date>
-- !query 8 output
-NULL NULL NULL
-2017-08-01 NULL 2017-08-01
+NULL None NULL
+2017-08-01 None 2017-08-01
2017-08-01 a 2017-08-02
2017-08-01 a 2017-08-02
2017-08-02 a 2017-08-02
@@ -150,14 +150,14 @@ NULL NULL NULL
-- !query 9
-SELECT val_timestamp, cate, avg(val_timestamp) OVER(PARTITION BY cate ORDER BY
val_timestamp
+SELECT val_timestamp, udf(cate), avg(val_timestamp) OVER(PARTITION BY cate
ORDER BY val_timestamp
RANGE BETWEEN CURRENT ROW AND interval 23 days 4 hours FOLLOWING) FROM testData
ORDER BY cate, val_timestamp
-- !query 9 schema
-struct<val_timestamp:timestamp,cate:string,avg(CAST(val_timestamp AS DOUBLE))
OVER (PARTITION BY cate ORDER BY val_timestamp ASC NULLS FIRST RANGE BETWEEN
CURRENT ROW AND interval 3 weeks 2 days 4 hours FOLLOWING):double>
+struct<val_timestamp:timestamp,udf(cate):string,avg(CAST(val_timestamp AS
DOUBLE)) OVER (PARTITION BY cate ORDER BY val_timestamp ASC NULLS FIRST RANGE
BETWEEN CURRENT ROW AND interval 3 weeks 2 days 4 hours FOLLOWING):double>
-- !query 9 output
-NULL NULL NULL
-2017-07-31 17:00:00 NULL 1.5015456E9
+NULL None NULL
+2017-07-31 17:00:00 None 1.5015456E9
2017-07-31 17:00:00 a 1.5016970666666667E9
2017-07-31 17:00:00 a 1.5016970666666667E9
2017-08-05 23:13:20 a 1.502E9
@@ -168,74 +168,74 @@ NULL NULL NULL
-- !query 10
-SELECT val, cate, sum(val) OVER(PARTITION BY cate ORDER BY val DESC
+SELECT udf(val), cate, sum(val) OVER(PARTITION BY cate ORDER BY val DESC
RANGE BETWEEN CURRENT ROW AND 1 FOLLOWING) FROM testData ORDER BY cate, val
-- !query 10 schema
-struct<val:int,cate:string,sum(val) OVER (PARTITION BY cate ORDER BY val DESC
NULLS LAST RANGE BETWEEN CURRENT ROW AND 1 FOLLOWING):bigint>
+struct<udf(val):string,cate:string,sum(val) OVER (PARTITION BY cate ORDER BY
val DESC NULLS LAST RANGE BETWEEN CURRENT ROW AND 1 FOLLOWING):bigint>
-- !query 10 output
-NULL NULL NULL
-3 NULL 3
-NULL a NULL
-1 a 2
-1 a 2
-2 a 4
-1 b 1
-2 b 3
-3 b 5
+nan NULL NULL
+3.0 NULL 3
+nan a NULL
+1.0 a 2
+1.0 a 2
+2.0 a 4
+1.0 b 1
+2.0 b 3
+3.0 b 5
-- !query 11
-SELECT val, cate, count(val) OVER(PARTITION BY cate
+SELECT udf(val), cate, count(val) OVER(PARTITION BY cate
ROWS BETWEEN UNBOUNDED FOLLOWING AND 1 FOLLOWING) FROM testData ORDER BY cate,
val
-- !query 11 schema
struct<>
-- !query 11 output
org.apache.spark.sql.AnalysisException
-cannot resolve 'ROWS BETWEEN UNBOUNDED FOLLOWING AND 1 FOLLOWING' due to data
type mismatch: Window frame upper bound '1' does not follow the lower bound
'unboundedfollowing$()'.; line 1 pos 33
+cannot resolve 'ROWS BETWEEN UNBOUNDED FOLLOWING AND 1 FOLLOWING' due to data
type mismatch: Window frame upper bound '1' does not follow the lower bound
'unboundedfollowing$()'.; line 1 pos 38
-- !query 12
-SELECT val, cate, count(val) OVER(PARTITION BY cate
+SELECT udf(val), cate, count(val) OVER(PARTITION BY cate
RANGE BETWEEN CURRENT ROW AND 1 FOLLOWING) FROM testData ORDER BY cate, val
-- !query 12 schema
struct<>
-- !query 12 output
org.apache.spark.sql.AnalysisException
-cannot resolve '(PARTITION BY testdata.`cate` RANGE BETWEEN CURRENT ROW AND 1
FOLLOWING)' due to data type mismatch: A range window frame cannot be used in
an unordered window specification.; line 1 pos 33
+cannot resolve '(PARTITION BY testdata.`cate` RANGE BETWEEN CURRENT ROW AND 1
FOLLOWING)' due to data type mismatch: A range window frame cannot be used in
an unordered window specification.; line 1 pos 38
-- !query 13
-SELECT val, cate, count(val) OVER(PARTITION BY cate ORDER BY val, cate
+SELECT udf(val), cate, count(val) OVER(PARTITION BY cate ORDER BY val, cate
RANGE BETWEEN CURRENT ROW AND 1 FOLLOWING) FROM testData ORDER BY cate, val
-- !query 13 schema
struct<>
-- !query 13 output
org.apache.spark.sql.AnalysisException
-cannot resolve '(PARTITION BY testdata.`cate` ORDER BY testdata.`val` ASC
NULLS FIRST, testdata.`cate` ASC NULLS FIRST RANGE BETWEEN CURRENT ROW AND 1
FOLLOWING)' due to data type mismatch: A range window frame with value
boundaries cannot be used in a window specification with multiple order by
expressions: val#x ASC NULLS FIRST,cate#x ASC NULLS FIRST; line 1 pos 33
+cannot resolve '(PARTITION BY testdata.`cate` ORDER BY testdata.`val` ASC
NULLS FIRST, testdata.`cate` ASC NULLS FIRST RANGE BETWEEN CURRENT ROW AND 1
FOLLOWING)' due to data type mismatch: A range window frame with value
boundaries cannot be used in a window specification with multiple order by
expressions: val#x ASC NULLS FIRST,cate#x ASC NULLS FIRST; line 1 pos 38
-- !query 14
-SELECT val, cate, count(val) OVER(PARTITION BY cate ORDER BY current_timestamp
+SELECT udf(val), cate, count(val) OVER(PARTITION BY cate ORDER BY
current_timestamp
RANGE BETWEEN CURRENT ROW AND 1 FOLLOWING) FROM testData ORDER BY cate, val
-- !query 14 schema
struct<>
-- !query 14 output
org.apache.spark.sql.AnalysisException
-cannot resolve '(PARTITION BY testdata.`cate` ORDER BY current_timestamp() ASC
NULLS FIRST RANGE BETWEEN CURRENT ROW AND 1 FOLLOWING)' due to data type
mismatch: The data type 'timestamp' used in the order specification does not
match the data type 'int' which is used in the range frame.; line 1 pos 33
+cannot resolve '(PARTITION BY testdata.`cate` ORDER BY current_timestamp() ASC
NULLS FIRST RANGE BETWEEN CURRENT ROW AND 1 FOLLOWING)' due to data type
mismatch: The data type 'timestamp' used in the order specification does not
match the data type 'int' which is used in the range frame.; line 1 pos 38
-- !query 15
-SELECT val, cate, count(val) OVER(PARTITION BY cate ORDER BY val
+SELECT udf(val), cate, count(val) OVER(PARTITION BY cate ORDER BY val
RANGE BETWEEN 1 FOLLOWING AND 1 PRECEDING) FROM testData ORDER BY cate, val
-- !query 15 schema
struct<>
-- !query 15 output
org.apache.spark.sql.AnalysisException
-cannot resolve 'RANGE BETWEEN 1 FOLLOWING AND 1 PRECEDING' due to data type
mismatch: The lower bound of a window frame
must be less than or equal to the upper bound; line 1 pos 33
+cannot resolve 'RANGE BETWEEN 1 FOLLOWING AND 1 PRECEDING' due to data type
mismatch: The lower bound of a window frame
must be less than or equal to the upper bound; line 1 pos 38
-- !query 16
-SELECT val, cate, count(val) OVER(PARTITION BY cate ORDER BY val
+SELECT udf(val), cate, count(val) OVER(PARTITION BY cate ORDER BY val
{code}
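As for the {{nan}}/{{None}} values in the output columns: assuming the UDF test base registers an identity UDF with a string return type, this is the usual pandas/Arrow behavior rather than a window bug. A minimal sketch without Spark (pure pandas, hypothetical column values) of why NULL integers render as "nan" and "3.0", while NULL strings render as "None":

```python
import pandas as pd

# A pandas-based UDF receives an int column containing NULLs as float64,
# because plain int dtype has no NULL representation; NULL becomes NaN.
vals = pd.Series([None, 3, 1]).astype("float64")

# Casting the UDF result to string then yields 'nan', '3.0', '1.0',
# which matches the "+" lines in the diff above.
print([str(v) for v in vals])  # ['nan', '3.0', '1.0']

# A plain Python UDF sees a NULL string as None, so the string cast
# yields the literal 'None' seen in the cate column.
print(str(None))  # 'None'
```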
was (Author: younggyuchun):
Hello [~hyukjin.kwon],
The following is the result of {{git diff}}. As you can see, there are some
"cannot resolve" errors. Do we need to file a JIRA, or is there something
wrong with it?
{code:sql}
-- !query 3
-SELECT val_long, cate, sum(val_long) OVER(PARTITION BY cate ORDER BY val_long
+SELECT val_long, cate, udf(sum(val_long)) OVER(PARTITION BY cate ORDER BY
val_long
ROWS BETWEEN CURRENT ROW AND 2147483648 FOLLOWING) FROM testData ORDER BY
cate, val_long
-- !query 3 schema
struct<>
-- !query 3 output
org.apache.spark.sql.AnalysisException
-cannot resolve 'ROWS BETWEEN CURRENT ROW AND 2147483648L FOLLOWING' due to
data type mismatch: The data type of the upper bound 'bigint' does not match
the expected data type 'int'.; line 1 pos 41
+cannot resolve 'ROWS BETWEEN CURRENT ROW AND 2147483648L FOLLOWING' due to
data type mismatch: The data type of the upper bound 'bigint' does not match
the expected data type 'int'.; line 1 pos 46
-- !query 11
-SELECT val, cate, count(val) OVER(PARTITION BY cate
+SELECT udf(val), cate, count(val) OVER(PARTITION BY cate
ROWS BETWEEN UNBOUNDED FOLLOWING AND 1 FOLLOWING) FROM testData ORDER BY cate,
val
-- !query 11 schema
struct<>
-- !query 11 output
org.apache.spark.sql.AnalysisException
-cannot resolve 'ROWS BETWEEN UNBOUNDED FOLLOWING AND 1 FOLLOWING' due to data
type mismatch: Window frame upper bound '1' does not follow the lower bound
'unboundedfollowing$()'.; line 1 pos 33
+cannot resolve 'ROWS BETWEEN UNBOUNDED FOLLOWING AND 1 FOLLOWING' due to data
type mismatch: Window frame upper bound '1' does not follow the lower bound
'unboundedfollowing$()'.; line 1 pos 38
-- !query 12
-SELECT val, cate, count(val) OVER(PARTITION BY cate
+SELECT udf(val), cate, count(val) OVER(PARTITION BY cate
RANGE BETWEEN CURRENT ROW AND 1 FOLLOWING) FROM testData ORDER BY cate, val
-- !query 12 schema
struct<>
-- !query 12 output
org.apache.spark.sql.AnalysisException
-cannot resolve '(PARTITION BY testdata.`cate` RANGE BETWEEN CURRENT ROW AND 1
FOLLOWING)' due to data type mismatch: A range window frame cannot be used in
an unordered window specification.; line 1 pos 33
+cannot resolve '(PARTITION BY testdata.`cate` RANGE BETWEEN CURRENT ROW AND 1
FOLLOWING)' due to data type mismatch: A range window frame cannot be used in
an unordered window specification.; line 1 pos 38
-- !query 13
-SELECT val, cate, count(val) OVER(PARTITION BY cate ORDER BY val, cate
+SELECT udf(val), cate, count(val) OVER(PARTITION BY cate ORDER BY val, cate
RANGE BETWEEN CURRENT ROW AND 1 FOLLOWING) FROM testData ORDER BY cate, val
-- !query 13 schema
struct<>
-- !query 13 output
org.apache.spark.sql.AnalysisException
-cannot resolve '(PARTITION BY testdata.`cate` ORDER BY testdata.`val` ASC
NULLS FIRST, testdata.`cate` ASC NULLS FIRST RANGE BETWEEN CURRENT ROW AND 1
FOLLOWING)' due to data type mismatch: A range window frame with value
boundaries cannot be used in a window specification with multiple order by
expressions: val#x ASC NULLS FIRST,cate#x ASC NULLS FIRST; line 1 pos 33
+cannot resolve '(PARTITION BY testdata.`cate` ORDER BY testdata.`val` ASC
NULLS FIRST, testdata.`cate` ASC NULLS FIRST RANGE BETWEEN CURRENT ROW AND 1
FOLLOWING)' due to data type mismatch: A range window frame with value
boundaries cannot be used in a window specification with multiple order by
expressions: val#x ASC NULLS FIRST,cate#x ASC NULLS FIRST; line 1 pos 38
-- !query 14
-SELECT val, cate, count(val) OVER(PARTITION BY cate ORDER BY current_timestamp
+SELECT udf(val), cate, count(val) OVER(PARTITION BY cate ORDER BY
current_timestamp
RANGE BETWEEN CURRENT ROW AND 1 FOLLOWING) FROM testData ORDER BY cate, val
-- !query 14 schema
struct<>
-- !query 14 output
org.apache.spark.sql.AnalysisException
-cannot resolve '(PARTITION BY testdata.`cate` ORDER BY current_timestamp() ASC
NULLS FIRST RANGE BETWEEN CURRENT ROW AND 1 FOLLOWING)' due to data type
mismatch: The data type 'timestamp' used in the order specification does not
match the data type 'int' which is used in the range frame.; line 1 pos 33
+cannot resolve '(PARTITION BY testdata.`cate` ORDER BY current_timestamp() ASC
NULLS FIRST RANGE BETWEEN CURRENT ROW AND 1 FOLLOWING)' due to data type
mismatch: The data type 'timestamp' used in the order specification does not
match the data type 'int' which is used in the range frame.; line 1 pos 38
-- !query 15
-SELECT val, cate, count(val) OVER(PARTITION BY cate ORDER BY val
+SELECT udf(val), cate, count(val) OVER(PARTITION BY cate ORDER BY val
RANGE BETWEEN 1 FOLLOWING AND 1 PRECEDING) FROM testData ORDER BY cate, val
-- !query 15 schema
struct<>
-- !query 15 output
org.apache.spark.sql.AnalysisException
-cannot resolve 'RANGE BETWEEN 1 FOLLOWING AND 1 PRECEDING' due to data type
mismatch: The lower bound of a window frame
must be less than or equal to the upper bound; line 1 pos 33
+cannot resolve 'RANGE BETWEEN 1 FOLLOWING AND 1 PRECEDING' due to data type
mismatch: The lower bound of a window frame
must be less than or equal to the upper bound; line 1 pos 38
{code}
> Convert and port 'window.sql' into UDF test base
> ------------------------------------------------
>
> Key: SPARK-28288
> URL: https://issues.apache.org/jira/browse/SPARK-28288
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark, SQL, Tests
> Affects Versions: 3.0.0
> Reporter: Hyukjin Kwon
> Priority: Major
>
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]