[jira] [Commented] (SPARK-28506) not handling usage of group function and window function at some conditions

2020-11-30 Thread Dylan Guedes (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17240650#comment-17240650
 ] 

Dylan Guedes commented on SPARK-28506:
--

I took this based on their golden files; maybe they changed the behavior in 
recent versions.

If you ran both queries in PgSQL and in SparkSQL and the output was the same, 
I think you can close this just fine.

> not handling usage of group function and window function at some conditions
> ---
>
> Key: SPARK-28506
> URL: https://issues.apache.org/jira/browse/SPARK-28506
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Hi,
> looks like SparkSQL is not able to handle this query:
> {code:sql}SELECT rank() OVER (ORDER BY 1), count(*) FROM empsalary GROUP BY 
> 1;{code}
> PgSQL, on the other hand, does.






[jira] [Commented] (SPARK-29638) Spark handles 'NaN' as 0 in sums

2020-11-30 Thread Dylan Guedes (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17240649#comment-17240649
 ] 

Dylan Guedes commented on SPARK-29638:
--

Hmmm, I wasn't expecting this :(
I assumed that PgSQL would return `NaN` based on the outputs of their golden 
tests. Maybe they changed this in recent versions? oO

Anyway, if Spark is summing `NaN` as zero, that's still inconsistent with PgSQL.
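
For reference, a sketch of the divergence on the query from the description, 
assuming the two behaviors as described (PgSQL propagating `NaN`, Spark summing 
it as zero):

{code:sql}
-- SUM(b) OVER (ORDER BY a ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
-- over b = 1, 2, NaN, 3, 4 would give:
--   PgSQL (NaN propagates):   1, 3, NaN, NaN, 7
--   Spark (NaN summed as 0):  1, 3, 2,   3,   7
{code}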

> Spark handles 'NaN' as 0 in sums
> 
>
> Key: SPARK-29638
> URL: https://issues.apache.org/jira/browse/SPARK-29638
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Currently, Spark handles 'NaN' as 0 in window functions, such that 3+'NaN'=3. 
> PgSQL, on the other hand, handles the entire result as 'NaN', as in 3+'NaN' = 
> 'NaN'
> I experienced this with the query below:
> {code:sql}
> SELECT a, b,
>SUM(b) OVER(ORDER BY A ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
> FROM (VALUES(1,1),(2,2),(3,(cast('nan' as int))),(4,3),(5,4)) t(a,b);
> {code}






[jira] [Commented] (SPARK-28064) Order by does not accept a call to rank()

2020-11-25 Thread Dylan Guedes (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17238662#comment-17238662
 ] 

Dylan Guedes commented on SPARK-28064:
--

Sorry, my only intention was to help map the differences between PostgreSQL 
and SparkSQL APIs.

> Order by does not accept a call to rank()
> -
>
> Key: SPARK-28064
> URL: https://issues.apache.org/jira/browse/SPARK-28064
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Currently in Spark, we can't use a call to `rank()` in an order by; we need to 
> first rename the rank column to, for instance, `r`, and then use `order by 
> r`. For example:
>  This does not work:
> {code:sql}
>  SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS 
> (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w;
> {code}
> However, this one does:
> {code:sql}
>  SELECT depname, empno, salary, rank() OVER w as r FROM empsalary WINDOW w AS 
> (PARTITION BY depname ORDER BY salary) ORDER BY r;
> {code}
> By the way, I took this one from Postgres behavior: Postgres accepts both ways.






[jira] [Commented] (SPARK-31539) Backport SPARK-27138 Remove AdminUtils calls (fixes deprecation)

2020-04-24 Thread Dylan Guedes (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091705#comment-17091705
 ] 

Dylan Guedes commented on SPARK-31539:
--

Agreed, I think it is not worth it.

> Backport SPARK-27138   Remove AdminUtils calls (fixes deprecation)
> --
>
> Key: SPARK-31539
> URL: https://issues.apache.org/jira/browse/SPARK-31539
> Project: Spark
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 2.4.6
>Reporter: Holden Karau
>Priority: Major
>
> SPARK-27138       Remove AdminUtils calls (fixes deprecation)






[jira] [Updated] (SPARK-29638) Spark handles 'NaN' as 0 in sums

2019-10-29 Thread Dylan Guedes (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dylan Guedes updated SPARK-29638:
-
Description: 
Currently, Spark handles 'NaN' as 0 in window functions, such that 3+'NaN'=3. 
PgSQL, on the other hand, handles the entire result as 'NaN', as in 3+'NaN' = 
'NaN'

I experienced this with the query below:

{code:sql}
SELECT a, b,
   SUM(b) OVER(ORDER BY A ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
FROM (VALUES(1,1),(2,2),(3,(cast('nan' as int))),(4,3),(5,4)) t(a,b);
{code}


  was:Currently, Spark handles 'NaN' as 0 in window functions, such that 
3+'NaN'=3. PgSQL, on the other hand, handles the entire result as 'NaN', as in 
3+'NaN' = 'NaN'


> Spark handles 'NaN' as 0 in sums
> 
>
> Key: SPARK-29638
> URL: https://issues.apache.org/jira/browse/SPARK-29638
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Currently, Spark handles 'NaN' as 0 in window functions, such that 3+'NaN'=3. 
> PgSQL, on the other hand, handles the entire result as 'NaN', as in 3+'NaN' = 
> 'NaN'
> I experienced this with the query below:
> {code:sql}
> SELECT a, b,
>SUM(b) OVER(ORDER BY A ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
> FROM (VALUES(1,1),(2,2),(3,(cast('nan' as int))),(4,3),(5,4)) t(a,b);
> {code}






[jira] [Created] (SPARK-29638) Spark handles 'NaN' as 0 in sums

2019-10-29 Thread Dylan Guedes (Jira)
Dylan Guedes created SPARK-29638:


 Summary: Spark handles 'NaN' as 0 in sums
 Key: SPARK-29638
 URL: https://issues.apache.org/jira/browse/SPARK-29638
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.0.0
Reporter: Dylan Guedes


Currently, Spark handles 'NaN' as 0 in window functions, such that 3+'NaN'=3. 
PgSQL, on the other hand, handles the entire result as 'NaN', as in 3+'NaN' = 
'NaN'






[jira] [Created] (SPARK-29636) Can't parse '11:00 BST' or '2000-10-19 10:23:54+01' signatures to timestamp

2019-10-29 Thread Dylan Guedes (Jira)
Dylan Guedes created SPARK-29636:


 Summary: Can't parse '11:00 BST' or '2000-10-19 10:23:54+01' 
signatures to timestamp
 Key: SPARK-29636
 URL: https://issues.apache.org/jira/browse/SPARK-29636
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.0.0
Reporter: Dylan Guedes


Currently, Spark can't parse a string such as '11:00 BST' or '2000-10-19 
10:23:54+01' to timestamp:

{code:sql}
spark-sql> select cast ('11:00 BST' as timestamp);
NULL
Time taken: 2.248 seconds, Fetched 1 row(s)
{code}
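
Presumably the other literal from the title fails the same way (an untested 
sketch):

{code:sql}
select cast('2000-10-19 10:23:54+01' as timestamp);
{code}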







[jira] [Created] (SPARK-29540) Thrift in some cases can't parse string to date

2019-10-21 Thread Dylan Guedes (Jira)
Dylan Guedes created SPARK-29540:


 Summary: Thrift in some cases can't parse string to date
 Key: SPARK-29540
 URL: https://issues.apache.org/jira/browse/SPARK-29540
 Project: Spark
  Issue Type: Sub-task
  Components: SQL, Tests
Affects Versions: 3.0.0
Reporter: Dylan Guedes


I'm porting tests from PostgreSQL window.sql but anything related to casting a 
string to datetime seems to fail on Thrift. For instance, the following does 
not work:

{code:sql}
CREATE TABLE empsalary (
  depname string,
  empno integer,
  salary int,
  enroll_date date
) USING parquet;

INSERT INTO empsalary VALUES ('develop', 10, 5200, '2007-08-01');
{code}
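
A typed date literal might sidestep the string-to-date cast (a sketch, assuming 
Spark's `DATE '...'` literal syntax; not verified on Thrift):

{code:sql}
INSERT INTO empsalary VALUES ('develop', 10, 5200, DATE '2007-08-01');
{code}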






[jira] [Reopened] (SPARK-29107) Add window.sql - Part 1

2019-10-19 Thread Dylan Guedes (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dylan Guedes reopened SPARK-29107:
--

We reverted the initial PR.

> Add window.sql - Part 1
> ---
>
> Key: SPARK-29107
> URL: https://issues.apache.org/jira/browse/SPARK-29107
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Tests
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Assignee: Dylan Guedes
>Priority: Major
> Fix For: 3.0.0
>
>
> In this ticket, we plan to add the regression test cases of 
> https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L1-L319






[jira] [Created] (SPARK-29451) Some queries with divisions in windows are failing in Thrift

2019-10-13 Thread Dylan Guedes (Jira)
Dylan Guedes created SPARK-29451:


 Summary: Some queries with divisions in windows are failing in 
Thrift
 Key: SPARK-29451
 URL: https://issues.apache.org/jira/browse/SPARK-29451
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.0.0
Reporter: Dylan Guedes


Hello,
the following queries are not working properly on Thrift. The only difference 
between them and some other queries that work fine is the numeric divisions, 
I think.
{code:sql}
SELECT four, ten/4 as two,
sum(ten/4) over (partition by four order by ten/4 rows between unbounded 
preceding and current row),
last(ten/4) over (partition by four order by ten/4 rows between unbounded 
preceding and current row)
FROM (select distinct ten, four from tenk1) ss;
{code}

{code:sql}
SELECT four, ten/4 as two,
sum(ten/4) over (partition by four order by ten/4 range between unbounded 
preceding and current row),
last(ten/4) over (partition by four order by ten/4 range between unbounded 
preceding and current row)
FROM (select distinct ten, four from tenk1) ss;
{code}
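
For contrast, a sketch of the same shape without the divisions, presumably the 
kind of query that works fine:

{code:sql}
SELECT four, ten,
sum(ten) over (partition by four order by ten rows between unbounded 
preceding and current row)
FROM (select distinct ten, four from tenk1) ss;
{code}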






[jira] [Updated] (SPARK-29451) Some queries with divisions in SQL windows are failing in Thrift

2019-10-13 Thread Dylan Guedes (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dylan Guedes updated SPARK-29451:
-
Summary: Some queries with divisions in SQL windows are failing in Thrift  
(was: Some queries with divisions in windows are failing in Thrift)

> Some queries with divisions in SQL windows are failing in Thrift
> -
>
> Key: SPARK-29451
> URL: https://issues.apache.org/jira/browse/SPARK-29451
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Hello,
> the following queries are not working properly on Thrift. The only difference 
> between them and some other queries that work fine is the numeric 
> divisions, I think.
> {code:sql}
> SELECT four, ten/4 as two,
> sum(ten/4) over (partition by four order by ten/4 rows between unbounded 
> preceding and current row),
> last(ten/4) over (partition by four order by ten/4 rows between unbounded 
> preceding and current row)
> FROM (select distinct ten, four from tenk1) ss;
> {code}
> {code:sql}
> SELECT four, ten/4 as two,
> sum(ten/4) over (partition by four order by ten/4 range between unbounded 
> preceding and current row),
> last(ten/4) over (partition by four order by ten/4 range between unbounded 
> preceding and current row)
> FROM (select distinct ten, four from tenk1) ss;
> {code}






[jira] [Updated] (SPARK-29108) Add window.sql - Part 2

2019-09-16 Thread Dylan Guedes (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dylan Guedes updated SPARK-29108:
-
Description: In this ticket, we plan to add the regression test cases of 
[https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L320-L562|https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L320-L562]
  (was: In this ticket, we plan to add the regression test cases of 
[https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L320-L562|https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L1-L319])

> Add window.sql - Part 2
> ---
>
> Key: SPARK-29108
> URL: https://issues.apache.org/jira/browse/SPARK-29108
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Tests
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
> Fix For: 3.0.0
>
>
> In this ticket, we plan to add the regression test cases of 
> [https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L320-L562|https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L320-L562]






[jira] [Updated] (SPARK-29109) Add window.sql - Part 3

2019-09-16 Thread Dylan Guedes (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dylan Guedes updated SPARK-29109:
-
Description: In this ticket, we plan to add the regression test cases of 
[https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L553-L911|https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L553-L911]
  (was: In this ticket, we plan to add the regression test cases of 
[https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L553-L911|https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L1-L319])

> Add window.sql - Part 3
> ---
>
> Key: SPARK-29109
> URL: https://issues.apache.org/jira/browse/SPARK-29109
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Tests
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
> Fix For: 3.0.0
>
>
> In this ticket, we plan to add the regression test cases of 
> [https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L553-L911|https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L553-L911]






[jira] [Created] (SPARK-29110) Add window.sql - Part 4

2019-09-16 Thread Dylan Guedes (Jira)
Dylan Guedes created SPARK-29110:


 Summary: Add window.sql - Part 4
 Key: SPARK-29110
 URL: https://issues.apache.org/jira/browse/SPARK-29110
 Project: Spark
  Issue Type: Sub-task
  Components: SQL, Tests
Affects Versions: 3.0.0
Reporter: Dylan Guedes
 Fix For: 3.0.0


In this ticket, we plan to add the regression test cases of 
[https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L912-L1259|https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L1-L319]






[jira] [Updated] (SPARK-29110) Add window.sql - Part 4

2019-09-16 Thread Dylan Guedes (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dylan Guedes updated SPARK-29110:
-
Description: In this ticket, we plan to add the regression test cases of 
[https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L912-L1259|https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L912-L1259]
  (was: In this ticket, we plan to add the regression test cases of 
[https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L912-L1259|https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L1-L319])

> Add window.sql - Part 4
> ---
>
> Key: SPARK-29110
> URL: https://issues.apache.org/jira/browse/SPARK-29110
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Tests
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
> Fix For: 3.0.0
>
>
> In this ticket, we plan to add the regression test cases of 
> [https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L912-L1259|https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L912-L1259]






[jira] [Created] (SPARK-29109) Add window.sql - Part 3

2019-09-16 Thread Dylan Guedes (Jira)
Dylan Guedes created SPARK-29109:


 Summary: Add window.sql - Part 3
 Key: SPARK-29109
 URL: https://issues.apache.org/jira/browse/SPARK-29109
 Project: Spark
  Issue Type: Sub-task
  Components: SQL, Tests
Affects Versions: 3.0.0
Reporter: Dylan Guedes
 Fix For: 3.0.0


In this ticket, we plan to add the regression test cases of 
[https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L553-L911|https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L1-L319]






[jira] [Created] (SPARK-29108) Add window.sql - Part 2

2019-09-16 Thread Dylan Guedes (Jira)
Dylan Guedes created SPARK-29108:


 Summary: Add window.sql - Part 2
 Key: SPARK-29108
 URL: https://issues.apache.org/jira/browse/SPARK-29108
 Project: Spark
  Issue Type: Sub-task
  Components: SQL, Tests
Affects Versions: 3.0.0
Reporter: Dylan Guedes
 Fix For: 3.0.0


In this ticket, we plan to add the regression test cases of 
[https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L320-L562|https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L1-L319]






[jira] [Created] (SPARK-29107) Add window.sql - Part 1

2019-09-16 Thread Dylan Guedes (Jira)
Dylan Guedes created SPARK-29107:


 Summary: Add window.sql - Part 1
 Key: SPARK-29107
 URL: https://issues.apache.org/jira/browse/SPARK-29107
 Project: Spark
  Issue Type: Sub-task
  Components: SQL, Tests
Affects Versions: 3.0.0
Reporter: Dylan Guedes
 Fix For: 3.0.0


In this ticket, we plan to add the regression test cases of 
https://github.com/postgres/postgres/blob/REL_12_BETA3/src/test/regress/sql/window.sql#L1-L319






[jira] [Updated] (SPARK-28648) Adds support to `groups` unit type in window clauses

2019-08-07 Thread Dylan Guedes (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dylan Guedes updated SPARK-28648:
-
Summary: Adds support to `groups` unit type in window clauses  (was: Adds 
support to `groups` in window clauses)

> Adds support to `groups` unit type in window clauses
> 
>
> Key: SPARK-28648
> URL: https://issues.apache.org/jira/browse/SPARK-28648
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Spark currently supports the two most common window frame unit types: rows 
> and ranges. However, in PgSQL a new type was added: `groups`.
> According to [this 
> source|https://blog.jooq.org/2018/07/05/postgresql-11s-support-for-sql-standard-groups-and-exclude-window-function-clauses/],
>  the difference is:
> """ROWS counts the exact number of rows in the frame.
> RANGE performs logical windowing where we don’t count the number of rows, but 
> look for a value offset.
> GROUPS counts all groups of tied rows within the window."""






[jira] [Created] (SPARK-28648) Adds support to `groups` in window clauses

2019-08-07 Thread Dylan Guedes (JIRA)
Dylan Guedes created SPARK-28648:


 Summary: Adds support to `groups` in window clauses
 Key: SPARK-28648
 URL: https://issues.apache.org/jira/browse/SPARK-28648
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.0.0
Reporter: Dylan Guedes


Spark currently supports the two most common window frame unit types: rows 
and ranges. However, in PgSQL a new type was added: `groups`. 

According to [this 
source|https://blog.jooq.org/2018/07/05/postgresql-11s-support-for-sql-standard-groups-and-exclude-window-function-clauses/],
 the difference is:
"""ROWS counts the exact number of rows in the frame.
RANGE performs logical windowing where we don’t count the number of rows, but 
look for a value offset.
GROUPS counts all groups of tied rows within the window."""
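
A hypothetical example of the `groups` syntax Spark would need to accept (a 
sketch, reusing the `empsalary` table from other tickets in this series):

{code:sql}
SELECT depname, salary,
       sum(salary) OVER (ORDER BY salary GROUPS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
FROM empsalary;
{code}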







[jira] [Created] (SPARK-28646) Allow usage of `count` only for parameterless aggregate function

2019-08-07 Thread Dylan Guedes (JIRA)
Dylan Guedes created SPARK-28646:


 Summary: Allow usage of `count` only for parameterless aggregate 
function
 Key: SPARK-28646
 URL: https://issues.apache.org/jira/browse/SPARK-28646
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.0.0
Reporter: Dylan Guedes


Currently, Spark allows `count` to be called with an empty argument list. For 
example, the following query actually works:
{code:sql}SELECT count() OVER () FROM tenk1;{code}
In PgSQL, on the other hand, the following error is thrown:
{code:sql}ERROR:  count(*) must be used to call a parameterless aggregate 
function{code}
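
For contrast, the parameterless form that both engines accept:

{code:sql}
SELECT count(*) OVER () FROM tenk1;
{code}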






[jira] [Updated] (SPARK-28645) Throw an error on window redefinition

2019-08-07 Thread Dylan Guedes (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dylan Guedes updated SPARK-28645:
-
Summary: Throw an error on window redefinition  (was: Block redefinition of 
window)

> Throw an error on window redefinition
> -
>
> Key: SPARK-28645
> URL: https://issues.apache.org/jira/browse/SPARK-28645
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Currently in Spark one can redefine a window. For instance:
> {code:sql}select count(*) OVER w FROM tenk1 WINDOW w AS (ORDER BY unique1), w 
> AS (ORDER BY unique1);{code}
> The window `w` is defined twice. In PgSQL, on the other hand, an error is 
> thrown:
> {code:sql}ERROR:  window "w" is already defined{code}
>  






[jira] [Created] (SPARK-28645) Block redefinition of window

2019-08-07 Thread Dylan Guedes (JIRA)
Dylan Guedes created SPARK-28645:


 Summary: Block redefinition of window
 Key: SPARK-28645
 URL: https://issues.apache.org/jira/browse/SPARK-28645
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.0.0
Reporter: Dylan Guedes


Currently in Spark one can redefine a window. For instance:
{code:sql}select count(*) OVER w FROM tenk1 WINDOW w AS (ORDER BY unique1), w 
AS (ORDER BY unique1);{code}
The window `w` is defined twice. In PgSQL, on the other hand, an error will 
be thrown:
{code:sql}ERROR:  window "w" is already defined{code}
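
A sketch of the unambiguous rewrite, giving each definition its own name:

{code:sql}
select count(*) OVER w1, count(*) OVER w2 FROM tenk1 
WINDOW w1 AS (ORDER BY unique1), w2 AS (ORDER BY unique1);
{code}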

 






[jira] [Created] (SPARK-28602) Recognize interval as a numeric type

2019-08-02 Thread Dylan Guedes (JIRA)
Dylan Guedes created SPARK-28602:


 Summary: Recognize interval as a numeric type
 Key: SPARK-28602
 URL: https://issues.apache.org/jira/browse/SPARK-28602
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.0.0
Reporter: Dylan Guedes


Hello,
Spark does not recognize `interval` type as a `numeric` one, which means that 
we can't use `interval` columns in aggregate functions. For instance, the 
following query works on PgSQL but does not work on Spark:
{code:sql}SELECT i,AVG(cast(v as interval)) OVER (ORDER BY i ROWS BETWEEN 
CURRENT ROW AND UNBOUNDED FOLLOWING) FROM (VALUES(1,'1 sec'),(2,'2 
sec'),(3,NULL),(4,NULL)) t(i,v);{code}

{code:sql}cannot resolve 'avg(CAST(`v` AS INTERVAL))' due to data type 
mismatch: function average requires numeric types, not interval; line 1 pos 
9{code}
 






[jira] [Created] (SPARK-28566) window functions should not be allowed in window definitions

2019-07-30 Thread Dylan Guedes (JIRA)
Dylan Guedes created SPARK-28566:


 Summary: window functions should not be allowed in window 
definitions
 Key: SPARK-28566
 URL: https://issues.apache.org/jira/browse/SPARK-28566
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.0.0
Reporter: Dylan Guedes


Currently, Spark allows the usage of window functions inside window 
definitions, such as:
{code:sql}
 SELECT rank() OVER (ORDER BY rank() OVER (ORDER BY random()));{code}
However, in PgSQL such behavior is not allowed:
{code:sql}
postgres=# SELECT rank() OVER (ORDER BY rank() OVER (ORDER BY random()));
ERROR:  window functions are not allowed in window definitions
LINE 1: SELECT rank() OVER (ORDER BY rank() OVER (ORDER BY random())...{code}
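
The intent can usually be expressed without nesting by computing the inner 
ranking in a subquery first (a sketch with a deterministic inner ordering, 
since Spark also lacks `random()`, see SPARK-28086):

{code:sql}
SELECT rank() OVER (ORDER BY r)
FROM (SELECT rank() OVER (ORDER BY unique1) AS r FROM tenk1) t;
{code}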






[jira] [Resolved] (SPARK-28553) subqueries must be aggregated before hand

2019-07-30 Thread Dylan Guedes (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dylan Guedes resolved SPARK-28553.
--
Resolution: Duplicate

This is a duplicate of SPARK-28379.

> subqueries must be aggregated before hand
> -
>
> Key: SPARK-28553
> URL: https://issues.apache.org/jira/browse/SPARK-28553
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Looks like Spark subqueries do not work well with variables and values from 
> outside of the subquery. For instance, this query works on PgSQL:
>  {code:sql}
> SELECT lead(ten, (SELECT two FROM tenk1 WHERE s.unique2 = unique2)) OVER 
> (PARTITION BY four ORDER BY ten) FROM tenk1 s WHERE unique2 < 10;{code}
> However, it does not work in Spark.






[jira] [Commented] (SPARK-28553) subqueries must be aggregated before hand

2019-07-30 Thread Dylan Guedes (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896064#comment-16896064
 ] 

Dylan Guedes commented on SPARK-28553:
--

Oh, my bad. I even looked for an already created JIRA; I don't know why I didn't 
find the one that you created (maybe I used the wrong terms). I'll edit my PR to 
use your JIRA instead and close this one. Thank you [~yumwang]

> subqueries must be aggregated before hand
> -
>
> Key: SPARK-28553
> URL: https://issues.apache.org/jira/browse/SPARK-28553
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Looks like Spark subqueries do not work well with variables and values from 
> outside of the subquery. For instance, this query works on PgSQL:
>  {code:sql}
> SELECT lead(ten, (SELECT two FROM tenk1 WHERE s.unique2 = unique2)) OVER 
> (PARTITION BY four ORDER BY ten) FROM tenk1 s WHERE unique2 < 10;{code}
> However, it does not work in Spark.






[jira] [Updated] (SPARK-28553) subqueries must be aggregated before hand

2019-07-29 Thread Dylan Guedes (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dylan Guedes updated SPARK-28553:
-
Summary: subqueries must be aggregated before hand  (was: subqueries always 
must be aggregated before hand)

> subqueries must be aggregated before hand
> -
>
> Key: SPARK-28553
> URL: https://issues.apache.org/jira/browse/SPARK-28553
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Looks like Spark subqueries do not work well with variables and values from 
> outside of the subquery. For instance, this query works on PgSQL:
>  {code:sql}
> SELECT lead(ten, (SELECT two FROM tenk1 WHERE s.unique2 = unique2)) OVER 
> (PARTITION BY four ORDER BY ten) FROM tenk1 s WHERE unique2 < 10;{code}
> However, it does not work in Spark.






[jira] [Updated] (SPARK-28553) subqueries always must be aggregated before hand

2019-07-29 Thread Dylan Guedes (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dylan Guedes updated SPARK-28553:
-
Description: 
Looks like Spark subqueries do not work well with variables and values from 
outside of the subquery. For instance, this query works on PgSQL:
 {code:sql}
SELECT lead(ten, (SELECT two FROM tenk1 WHERE s.unique2 = unique2)) OVER 
(PARTITION BY four ORDER BY ten) FROM tenk1 s WHERE unique2 < 10;{code}
However, it does not work in Spark.

  was:
Looks like Spark subqueries does not work well with variables and values from 
outside of subquery. For instance, this query work on PgSQL:
 {code:sql}
-- SELECT lead(ten, (SELECT two FROM tenk1 WHERE s.unique2 = unique2)) OVER 
(PARTITION BY four ORDER BY ten)
  -- FROM tenk1 s WHERE unique2 < 10;{code}
However, it does not work in Spark.


> subqueries always must be aggregated before hand
> 
>
> Key: SPARK-28553
> URL: https://issues.apache.org/jira/browse/SPARK-28553
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Looks like Spark subqueries do not work well with variables and values from 
> outside of the subquery. For instance, this query works on PgSQL:
>  {code:sql}
> SELECT lead(ten, (SELECT two FROM tenk1 WHERE s.unique2 = unique2)) OVER 
> (PARTITION BY four ORDER BY ten) FROM tenk1 s WHERE unique2 < 10;{code}
> However, it does not work in Spark.






[jira] [Updated] (SPARK-28553) subqueries always must be aggregated before hand

2019-07-29 Thread Dylan Guedes (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dylan Guedes updated SPARK-28553:
-
Summary: subqueries always must be aggregated before hand  (was: subqueries 
always must be aggregated before)

> subqueries always must be aggregated before hand
> 
>
> Key: SPARK-28553
> URL: https://issues.apache.org/jira/browse/SPARK-28553
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Looks like Spark subqueries do not work well with variables and values from 
> outside of the subquery. For instance, this query works on PgSQL:
>  {code:sql}
> -- SELECT lead(ten, (SELECT two FROM tenk1 WHERE s.unique2 = unique2)) OVER 
> (PARTITION BY four ORDER BY ten)  
> -- FROM tenk1 s WHERE unique2 < 10;{code}
> However, it does not work in Spark.






[jira] [Created] (SPARK-28553) subqueries always must be aggregated before

2019-07-29 Thread Dylan Guedes (JIRA)
Dylan Guedes created SPARK-28553:


 Summary: subqueries always must be aggregated before
 Key: SPARK-28553
 URL: https://issues.apache.org/jira/browse/SPARK-28553
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.0.0
Reporter: Dylan Guedes


Looks like Spark subqueries do not work well with variables and values from 
outside of the subquery. For instance, this query works on PgSQL:
 {code:sql}
-- SELECT lead(ten, (SELECT two FROM tenk1 WHERE s.unique2 = unique2)) OVER 
(PARTITION BY four ORDER BY ten)
  -- FROM tenk1 s WHERE unique2 < 10;{code}
However, it does not work in Spark.






[jira] [Created] (SPARK-28516) adds `to_char`

2019-07-25 Thread Dylan Guedes (JIRA)
Dylan Guedes created SPARK-28516:


 Summary: adds `to_char`
 Key: SPARK-28516
 URL: https://issues.apache.org/jira/browse/SPARK-28516
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.0.0
Reporter: Dylan Guedes


Currently, Spark does not have support for `to_char`. PgSQL, however, 
[does|https://www.postgresql.org/docs/9.6/functions-formatting.html]:

Query example:

{code:sql}
SELECT to_char(SUM(n) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING), '9D9')
{code}






[jira] [Created] (SPARK-28508) Support for range frame+row frame in the same query

2019-07-24 Thread Dylan Guedes (JIRA)
Dylan Guedes created SPARK-28508:


 Summary: Support for range frame+row frame in the same query
 Key: SPARK-28508
 URL: https://issues.apache.org/jira/browse/SPARK-28508
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.0.0
Reporter: Dylan Guedes


Currently, looks like some queries do not work if both a range frame and a 
row frame are given. However, PgSQL is able to handle them:

{code:sql}
select last(salary) over(order by enroll_date range between 1 preceding and 1 
following), lag(salary) over(order by enroll_date range between 1 preceding and 
1 following),
salary, enroll_date from empsalary;
{code}






[jira] [Updated] (SPARK-28506) not handling usage of group function and window function at some conditions

2019-07-24 Thread Dylan Guedes (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dylan Guedes updated SPARK-28506:
-
Summary: not handling usage of group function and window function at some 
conditions  (was: now handling usage of group function and window function at 
some conditions)

> not handling usage of group function and window function at some conditions
> ---
>
> Key: SPARK-28506
> URL: https://issues.apache.org/jira/browse/SPARK-28506
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Hi,
> looks like SparkSQL is not able to handle this query:
> {code:sql}SELECT rank() OVER (ORDER BY 1), count(*) FROM empsalary GROUP BY 
> 1;{code}
> PgSQL, on the other hand, does.






[jira] [Created] (SPARK-28506) now handling usage of group function and window function at some conditions

2019-07-24 Thread Dylan Guedes (JIRA)
Dylan Guedes created SPARK-28506:


 Summary: now handling usage of group function and window function 
at some conditions
 Key: SPARK-28506
 URL: https://issues.apache.org/jira/browse/SPARK-28506
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.0.0
Reporter: Dylan Guedes


Hi,

looks like SparkSQL is not able to handle this query:

{code:sql}SELECT rank() OVER (ORDER BY 1), count(*) FROM empsalary GROUP BY 
1;{code}

PgSQL, on the other hand, does.






[jira] [Updated] (SPARK-20856) support statement using nested joins

2019-07-24 Thread Dylan Guedes (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-20856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dylan Guedes updated SPARK-20856:
-
Description: 
While DB2, Oracle, etc. support a join expressed as follows, Spark SQL does not. 
Not supported:
{code:sql}
select * from 
  cert.tsint tsint inner join cert.tint tint inner join cert.tbint tbint
 on tbint.rnum = tint.rnum
 on tint.rnum = tsint.rnum
{code}
versus written as shown
{code:sql}
select * from 
  cert.tsint tsint inner join cert.tint tint on tsint.rnum = tint.rnum inner 
join cert.tbint tbint on tint.rnum = tbint.rnum
{code}   

{code:text}
ERROR_STATE, SQL state: org.apache.spark.sql.catalyst.parser.ParseException: 
extraneous input 'on' expecting {, ',', '.', '[', 'WHERE', 'GROUP', 
'ORDER', 'HAVING', 'LIMIT', 'OR', 'AND', 'IN', NOT, 'BETWEEN', 'LIKE', RLIKE, 
'IS', 'JOIN', 'CROSS', 'INNER', 'LEFT', 'RIGHT', 'FULL', 'NATURAL', 'LATERAL', 
'WINDOW', 'UNION', 'EXCEPT', 'MINUS', 'INTERSECT', EQ, '<=>', '<>', '!=', '<', 
LTE, '>', GTE, '+', '-', '*', '/', '%', 'DIV', '&', '|', '^', 'SORT', 
'CLUSTER', 'DISTRIBUTE', 'ANTI'}(line 4, pos 5)

== SQL ==
select * from 
  cert.tsint tsint inner join cert.tint tint inner join cert.tbint tbint
 on tbint.rnum = tint.rnum
 on tint.rnum = tsint.rnum
-^^^
, Query: select * from 
  cert.tsint tsint inner join cert.tint tint inner join cert.tbint tbint
 on tbint.rnum = tint.rnum
 on tint.rnum = tsint.rnum.
SQLState:  HY000
ErrorCode: 500051
{code}


  was:
While DB2, ORACLE etc support a join expressed as follows, SPARK SQL does not. 

Not supported
select * from 
  cert.tsint tsint inner join cert.tint tint inner join cert.tbint tbint
 on tbint.rnum = tint.rnum
 on tint.rnum = tsint.rnum

versus written as shown
select * from 
  cert.tsint tsint inner join cert.tint tint on tsint.rnum = tint.rnum inner 
join cert.tbint tbint on tint.rnum = tbint.rnum
   


ERROR_STATE, SQL state: org.apache.spark.sql.catalyst.parser.ParseException: 
extraneous input 'on' expecting {, ',', '.', '[', 'WHERE', 'GROUP', 
'ORDER', 'HAVING', 'LIMIT', 'OR', 'AND', 'IN', NOT, 'BETWEEN', 'LIKE', RLIKE, 
'IS', 'JOIN', 'CROSS', 'INNER', 'LEFT', 'RIGHT', 'FULL', 'NATURAL', 'LATERAL', 
'WINDOW', 'UNION', 'EXCEPT', 'MINUS', 'INTERSECT', EQ, '<=>', '<>', '!=', '<', 
LTE, '>', GTE, '+', '-', '*', '/', '%', 'DIV', '&', '|', '^', 'SORT', 
'CLUSTER', 'DISTRIBUTE', 'ANTI'}(line 4, pos 5)

== SQL ==
select * from 
  cert.tsint tsint inner join cert.tint tint inner join cert.tbint tbint
 on tbint.rnum = tint.rnum
 on tint.rnum = tsint.rnum
-^^^
, Query: select * from 
  cert.tsint tsint inner join cert.tint tint inner join cert.tbint tbint
 on tbint.rnum = tint.rnum
 on tint.rnum = tsint.rnum.
SQLState:  HY000
ErrorCode: 500051




> support statement using nested joins
> 
>
> Key: SPARK-20856
> URL: https://issues.apache.org/jira/browse/SPARK-20856
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.1.0
>Reporter: N Campbell
>Priority: Major
>  Labels: bulk-closed
>
> While DB2, Oracle, etc. support a join expressed as follows, Spark SQL does 
> not. 
> Not supported:
> {code:sql}
> select * from 
>   cert.tsint tsint inner join cert.tint tint inner join cert.tbint tbint
>  on tbint.rnum = tint.rnum
>  on tint.rnum = tsint.rnum
> {code}
> versus written as shown
> {code:sql}
> select * from 
>   cert.tsint tsint inner join cert.tint tint on tsint.rnum = tint.rnum inner 
> join cert.tbint tbint on tint.rnum = tbint.rnum
> {code}   
> {code:text}
> ERROR_STATE, SQL state: org.apache.spark.sql.catalyst.parser.ParseException: 
> extraneous input 'on' expecting {, ',', '.', '[', 'WHERE', 'GROUP', 
> 'ORDER', 'HAVING', 'LIMIT', 'OR', 'AND', 'IN', NOT, 'BETWEEN', 'LIKE', RLIKE, 
> 'IS', 'JOIN', 'CROSS', 'INNER', 'LEFT', 'RIGHT', 'FULL', 'NATURAL', 
> 'LATERAL', 'WINDOW', 'UNION', 'EXCEPT', 'MINUS', 'INTERSECT', EQ, '<=>', 
> '<>', '!=', '<', LTE, '>', GTE, '+', '-', '*', '/', '%', 'DIV', '&', '|', 
> '^', 'SORT', 'CLUSTER', 'DISTRIBUTE', 'ANTI'}(line 4, pos 5)
> == SQL ==
> select * from 
>   cert.tsint tsint inner join cert.tint tint inner join cert.tbint tbint
>  on tbint.rnum = tint.rnum
>  on tint.rnum = tsint.rnum
> -^^^
> , Query: select * from 
>   cert.tsint tsint inner join cert.tint tint inner join cert.tbint tbint
>  on tbint.rnum = tint.rnum
>  on tint.rnum = tsint.rnum.
> SQLState:  HY000
> ErrorCode: 500051
> {code}






[jira] [Commented] (SPARK-28500) adds support for `filter` clause

2019-07-24 Thread Dylan Guedes (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891833#comment-16891833
 ] 

Dylan Guedes commented on SPARK-28500:
--

Hmm, you are correct, there's some overlap there. However, at some point the 
JIRA will be fragmented into smaller ones, like one for filter, another for 
distinct, right?

> adds support for `filter` clause
> 
>
> Key: SPARK-28500
> URL: https://issues.apache.org/jira/browse/SPARK-28500
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Definition: "The {{filter}} clause extends aggregate functions ({{sum}}, 
> {{avg}}, {{count}}, …) by an additional {{where}} clause. The result of the 
> aggregate is built from only the rows that satisfy the additional {{where}} 
> clause too." [source|https://modern-sql.com/feature/filter]
> Also, PgSQL currently supports `filter` while Spark doesn't.






[jira] [Created] (SPARK-28501) frame bound must be a literal

2019-07-24 Thread Dylan Guedes (JIRA)
Dylan Guedes created SPARK-28501:


 Summary: frame bound must be a literal
 Key: SPARK-28501
 URL: https://issues.apache.org/jira/browse/SPARK-28501
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.0.0
Reporter: Dylan Guedes


Spark frame bounds currently only support literals, so the following is rejected:

{code:sql}
SELECT sum(unique1) over  (order by unique1 rows (SELECT unique1 FROM tenk1 
ORDER BY unique1 LIMIT 1) + 1 PRECEDING),  unique1 FROM tenk1 WHERE unique1 < 
10;{code}
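
For contrast, a sketch of the literal-bound form that Spark does accept:

{code:sql}
SELECT sum(unique1) over (order by unique1 rows 2 preceding), unique1 
FROM tenk1 WHERE unique1 < 10;
{code}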






[jira] [Updated] (SPARK-28065) ntile only accepting positive (>0) values

2019-07-24 Thread Dylan Guedes (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dylan Guedes updated SPARK-28065:
-
Description: 
Currently, Spark does not accept null (or zero) as an input for `ntile`; 
however, Postgres supports it.
 Example:
{code:sql}
SELECT ntile(NULL) OVER (ORDER BY ten, four), ten, four FROM tenk1 LIMIT 2;
{code}

  was:
Currently, Spark does not accept null as an input for `ntile`, however Postgres 
supports it.
Example:

{code:sql}
SELECT ntile(NULL) OVER (ORDER BY ten, four), ten, four FROM tenk1 LIMIT 2;
{code}



> ntile only accepting positive (>0) values
> -
>
> Key: SPARK-28065
> URL: https://issues.apache.org/jira/browse/SPARK-28065
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Currently, Spark does not accept null (or zero) as an input for `ntile`; 
> however, Postgres supports it.
>  Example:
> {code:sql}
> SELECT ntile(NULL) OVER (ORDER BY ten, four), ten, four FROM tenk1 LIMIT 2;
> {code}






[jira] [Updated] (SPARK-28065) ntile only accepting posivile (>0) values

2019-07-24 Thread Dylan Guedes (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dylan Guedes updated SPARK-28065:
-
Summary: ntile only accepting posivile (>0) values  (was: ntile does not 
accept NULL as input)

> ntile only accepting posivile (>0) values
> -
>
> Key: SPARK-28065
> URL: https://issues.apache.org/jira/browse/SPARK-28065
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Currently, Spark does not accept null as an input for `ntile`; however, 
> Postgres supports it.
> Example:
> {code:sql}
> SELECT ntile(NULL) OVER (ORDER BY ten, four), ten, four FROM tenk1 LIMIT 2;
> {code}






[jira] [Updated] (SPARK-28065) ntile only accepting positive (>0) values

2019-07-24 Thread Dylan Guedes (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dylan Guedes updated SPARK-28065:
-
Summary: ntile only accepting positive (>0) values  (was: ntile only 
accepting posivile (>0) values)

> ntile only accepting positive (>0) values
> -
>
> Key: SPARK-28065
> URL: https://issues.apache.org/jira/browse/SPARK-28065
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Currently, Spark does not accept null as an input for `ntile`; however, 
> Postgres supports it.
> Example:
> {code:sql}
> SELECT ntile(NULL) OVER (ORDER BY ten, four), ten, four FROM tenk1 LIMIT 2;
> {code}






[jira] [Created] (SPARK-28500) adds support for `filter` clause

2019-07-24 Thread Dylan Guedes (JIRA)
Dylan Guedes created SPARK-28500:


 Summary: adds support for `filter` clause
 Key: SPARK-28500
 URL: https://issues.apache.org/jira/browse/SPARK-28500
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.0.0
Reporter: Dylan Guedes


Definition: "The {{filter}} clause extends aggregate functions ({{sum}}, 
{{avg}}, {{count}}, …) by an additional {{where}} clause. The result of the 
aggregate is built from only the rows that satisfy the additional {{where}} 
clause too." [source|https://modern-sql.com/feature/filter]

Also, PgSQL currently supports `filter` while Spark doesn't.






[jira] [Created] (SPARK-28429) SQL Datetime util function being casted to double instead of timestamp

2019-07-17 Thread Dylan Guedes (JIRA)
Dylan Guedes created SPARK-28429:


 Summary: SQL Datetime util function being casted to double instead 
of timestamp
 Key: SPARK-28429
 URL: https://issues.apache.org/jira/browse/SPARK-28429
 Project: Spark
  Issue Type: Sub-task
  Components: SQL, Tests
Affects Versions: 3.0.0
Reporter: Dylan Guedes


In the code below, the '100 days' in now()+'100 days' is cast to double and 
then an error is thrown:
{code:sql}
CREATE TEMP VIEW v_window AS
SELECT i, min(i) over (order by i range between '1 day' preceding and '10 days' 
following) as min_i
FROM range(now(), now()+'100 days', '1 hour') i;
{code}
Error:

{code:sql}
cannot resolve '(current_timestamp() + CAST('100 days' AS DOUBLE))' due to data 
type mismatch: differing      types in '(current_timestamp() + CAST('100 days' 
AS DOUBLE))' (timestamp and double).;{code}
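
At the expression level, an explicit interval literal avoids the implicit 
double cast (a sketch, assuming Spark's interval literal syntax; whether 
`range()` then accepts timestamp bounds is a separate question):

{code:sql}
SELECT now() + interval '100 days';
{code}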






[jira] [Created] (SPARK-28428) Spark `exclude` always expecting `()`

2019-07-17 Thread Dylan Guedes (JIRA)
Dylan Guedes created SPARK-28428:


 Summary: Spark `exclude` always expecting `()` 
 Key: SPARK-28428
 URL: https://issues.apache.org/jira/browse/SPARK-28428
 Project: Spark
  Issue Type: Sub-task
  Components: SQL, Tests
Affects Versions: 3.0.0
Reporter: Dylan Guedes


SparkSQL's `exclude` always expects a following `()`; PgSQL's `exclude` does 
not. Example:

{code:sql}
SELECT sum(unique1) over (rows between 2 preceding and 2 following exclude no 
others),
unique1, four
FROM tenk1 WHERE unique1 < 10;
{code}







[jira] [Commented] (SPARK-28086) Adds `random()` sql function

2019-07-16 Thread Dylan Guedes (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886382#comment-16886382
 ] 

Dylan Guedes commented on SPARK-28086:
--

Well, to be fair I created the JIRA because `rand()` looks like a number 
generator, while `random()` (available in PgSQL) seems like a "pick any 
available value". For instance: you may use `order by random()` in PgSQL; 
however, in Spark `order by rand()` is not valid. But I'm probably wrong: 
maybe it is related to PgSQL's `order by` accepting literal values while 
Spark's does not.
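
The comparison drawn above, spelled out (as reported in this thread; not 
verified):

{code:sql}
-- PgSQL: valid
SELECT * FROM tenk1 ORDER BY random();
-- Spark: reported here as not valid
SELECT * FROM tenk1 ORDER BY rand();
{code}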

> Adds `random()` sql function
> 
>
> Key: SPARK-28086
> URL: https://issues.apache.org/jira/browse/SPARK-28086
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Currently, Spark does not have a `random()` function. Postgres, however, does.
> For instance, this one is not valid:
> {code:sql}
> SELECT rank() OVER (ORDER BY rank() OVER (ORDER BY random()))
> {code}
> Because of the `random()` call. On the other hand, [Postgres has 
> it.|https://www.postgresql.org/docs/8.2/functions-math.html]






[jira] [Comment Edited] (SPARK-27767) Built-in function: generate_series

2019-06-17 Thread Dylan Guedes (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16865912#comment-16865912
 ] 

Dylan Guedes edited comment on SPARK-27767 at 6/17/19 8:02 PM:
---

[~smilegator] by the way, I just checked and there is a (minor) difference: 
when you use `range()` and define it as a sub-query called `x`, for instance, 
the default name for the column becomes `x.id`, instead of just `x` as in 
Postgres. For instance:

{code:sql}
from range(-32766, -32764) x;
{code}
In Spark, it looks like you should reference these values as `x.id`. Meanwhile, 
in Postgres you can reference them through just `x`. 

EDIT: Btw, this call also does not work:

{code:sql}
SELECT range(1, 100) OVER () FROM empsalary
{code}
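
Spelled out, the naming difference described above (a sketch; the Postgres 
side uses `generate_series`):

{code:sql}
-- Spark: the generated column keeps its default name, so reference x.id
SELECT x.id FROM range(-32766, -32764) x;
-- Postgres: the alias itself names the value
SELECT x FROM generate_series(-32766, -32764) x;
{code}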



was (Author: dylanguedes):
[~smilegator] by the way, I just checked and there is a (minor) difference: 
when you use `range()` and define it as a sub-query called `x`, for instance, 
the default name for the column became `x.id`, instead of just `x`, that is the 
behaviour in Postgres. For instance:

{code:sql}
from range(-32766, -32764) x;
{code}
In Spark, looks like you should reference to these values as `x.id`. Meanwhile, 
in Postgres you can call them through just `x`. 

> Built-in function: generate_series
> --
>
> Key: SPARK-27767
> URL: https://issues.apache.org/jira/browse/SPARK-27767
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
>
> [https://www.postgresql.org/docs/9.1/functions-srf.html]
> generate_series(start, stop): Generate a series of values, from start to stop 
> with a step size of one
>  






[jira] [Commented] (SPARK-27767) Built-in function: generate_series

2019-06-17 Thread Dylan Guedes (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16865912#comment-16865912
 ] 

Dylan Guedes commented on SPARK-27767:
--

[~smilegator] by the way, I just checked and there is a (minor) difference: 
when you use `range()` and define it as a sub-query called `x`, for instance, 
the default name for the column becomes `x.id`, instead of just `x` as in 
Postgres. For instance:

{code:sql}
from range(-32766, -32764) x;
{code}
In Spark, it looks like you should reference these values as `x.id`. Meanwhile, 
in Postgres you can reference them through just `x`. 

> Built-in function: generate_series
> --
>
> Key: SPARK-27767
> URL: https://issues.apache.org/jira/browse/SPARK-27767
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
>
> [https://www.postgresql.org/docs/9.1/functions-srf.html]
> generate_series(start, stop): Generate a series of values, from start to stop 
> with a step size of one
>  






[jira] [Created] (SPARK-28086) Adds `random()` sql function

2019-06-17 Thread Dylan Guedes (JIRA)
Dylan Guedes created SPARK-28086:


 Summary: Adds `random()` sql function
 Key: SPARK-28086
 URL: https://issues.apache.org/jira/browse/SPARK-28086
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.0.0
Reporter: Dylan Guedes


Currently, Spark does not have a `random()` function. Postgres, however, does.

For instance, this one is not valid:

{code:sql}
SELECT rank() OVER (ORDER BY rank() OVER (ORDER BY random()))
{code}

It fails because of the `random()` call. Postgres, on the other hand, [has 
it.|https://www.postgresql.org/docs/8.2/functions-math.html]
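
For comparison, a minimal sketch of the two engines (Spark's closest built-in 
is `rand()`, which is not a drop-in replacement name-wise):

{code:sql}
-- PostgreSQL: returns a random double in [0.0, 1.0)
SELECT random();
-- Spark's closest built-in equivalent:
SELECT rand();
{code}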






[jira] [Created] (SPARK-28068) `lag` second argument must be a literal

2019-06-16 Thread Dylan Guedes (JIRA)
Dylan Guedes created SPARK-28068:


 Summary: `lag` second argument must be a literal
 Key: SPARK-28068
 URL: https://issues.apache.org/jira/browse/SPARK-28068
 Project: Spark
  Issue Type: Task
  Components: SQL
Affects Versions: 3.0.0
Reporter: Dylan Guedes


Currently in Spark, `lag` (and possibly some other window functions) requires 
the 2nd argument to be a literal.
For example, this is not allowed:

{code:sql}
SELECT lag(ten, four) OVER (PARTITION BY four ORDER BY ten), ten, four FROM 
tenk1 WHERE unique2 < 10;
{code}
However, this one works:

{code:sql}
SELECT lag(ten, 2) OVER (PARTITION BY four ORDER BY ten), ten, four FROM tenk1 
WHERE unique2 < 10;
{code}

In comparison, Postgres accepts a non-literal (such as a column reference) as 
the 2nd argument. I found this issue while porting the `window.sql` tests from 
Postgres to Spark.






[jira] [Created] (SPARK-28065) ntile does not accept NULL as input

2019-06-16 Thread Dylan Guedes (JIRA)
Dylan Guedes created SPARK-28065:


 Summary: ntile does not accept NULL as input
 Key: SPARK-28065
 URL: https://issues.apache.org/jira/browse/SPARK-28065
 Project: Spark
  Issue Type: Task
  Components: SQL
Affects Versions: 3.0.0
Reporter: Dylan Guedes


Currently, Spark does not accept NULL as an input for `ntile`; Postgres, 
however, supports it.
Example:

{code:sql}
SELECT ntile(NULL) OVER (ORDER BY ten, four), ten, four FROM tenk1 LIMIT 2;
{code}







[jira] [Created] (SPARK-28064) Order by does not accept a call to rank()

2019-06-16 Thread Dylan Guedes (JIRA)
Dylan Guedes created SPARK-28064:


 Summary: Order by does not accept a call to rank()
 Key: SPARK-28064
 URL: https://issues.apache.org/jira/browse/SPARK-28064
 Project: Spark
  Issue Type: Task
  Components: SQL
Affects Versions: 3.0.0
Reporter: Dylan Guedes


Currently in Spark, we can't use a call to `rank()` in an `order by`; we need 
to first alias the rank column as, for instance, `r` and then use `order by r`. 
For example, this does not work:


{code:sql}
 SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS 
(PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w;
{code}

However, this one does:
{code:sql}
 SELECT depname, empno, salary, rank() OVER w as r FROM empsalary WINDOW w AS 
(PARTITION BY depname ORDER BY salary) ORDER BY r;
{code}

By the way, I took this one from Postgres behavior: Postgres accepts both forms.







[jira] [Commented] (SPARK-23160) Add window.sql

2019-06-15 Thread Dylan Guedes (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-23160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864692#comment-16864692
 ] 

Dylan Guedes commented on SPARK-23160:
--

Thank you! I'll be working on this, then.

> Add window.sql
> --
>
> Key: SPARK-23160
> URL: https://issues.apache.org/jira/browse/SPARK-23160
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Tests
>Affects Versions: 3.0.0
>Reporter: Xingbo Jiang
>Priority: Minor
>
> In this ticket, we plan to add the regression test cases of 
> https://github.com/postgres/postgres/blob/REL_12_BETA1/src/test/regress/sql/window.sql.






[jira] [Commented] (SPARK-23098) Migrate Kafka batch source to v2

2019-05-08 Thread Dylan Guedes (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-23098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835693#comment-16835693
 ] 

Dylan Guedes commented on SPARK-23098:
--

[~gsomogyi] No, I didn't pick this one. Anyway, happy to see you interested in 
it!

> Migrate Kafka batch source to v2
> 
>
> Key: SPARK-23098
> URL: https://issues.apache.org/jira/browse/SPARK-23098
> Project: Spark
>  Issue Type: Sub-task
>  Components: Structured Streaming
>Affects Versions: 2.4.0
>Reporter: Jose Torres
>Priority: Major
>







[jira] [Updated] (SPARK-27138) Remove AdminUtils calls

2019-03-12 Thread Dylan Guedes (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dylan Guedes updated SPARK-27138:
-
Description: KafkaTestUtils (from kafka010) currently uses AdminUtils to 
create and delete topics for test suites (what is currently deprecated). Since 
it will stop to work at some point, I think that it is a good opportunity to 
change the API calls.  (was: KafkaTestUtils (from kafka010) currently uses 
AdminUtils to create and delete topics for test suites (what is currently 
deprecated). Since it will stop to work at some point, I think that it is a 
good opportunity.)

> Remove AdminUtils calls
> ---
>
> Key: SPARK-27138
> URL: https://issues.apache.org/jira/browse/SPARK-27138
> Project: Spark
>  Issue Type: Task
>  Components: Tests
>Affects Versions: 2.4.0
>Reporter: Dylan Guedes
>Priority: Minor
>
> KafkaTestUtils (from kafka010) currently uses AdminUtils to create and delete 
> topics for test suites (which is currently deprecated). Since it will stop 
> working at some point, I think that it is a good opportunity to change the API 
> calls.






[jira] [Created] (SPARK-27138) Remove AdminUtils calls

2019-03-12 Thread Dylan Guedes (JIRA)
Dylan Guedes created SPARK-27138:


 Summary: Remove AdminUtils calls
 Key: SPARK-27138
 URL: https://issues.apache.org/jira/browse/SPARK-27138
 Project: Spark
  Issue Type: Task
  Components: Tests
Affects Versions: 2.4.0
Reporter: Dylan Guedes


KafkaTestUtils (from kafka010) currently uses AdminUtils to create and delete 
topics for test suites (which is currently deprecated). Since it will stop 
working at some point, I think that it is a good opportunity.






[jira] [Commented] (SPARK-23098) Migrate Kafka batch source to v2

2019-03-12 Thread Dylan Guedes (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-23098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790585#comment-16790585
 ] 

Dylan Guedes commented on SPARK-23098:
--

Hi,
I would like to work on this one. [~joseph.torres] do you mind helping me with 
a few suggestions if I get really stuck? Also, is this one similar to the 
CSVReader/JSONReader?

> Migrate Kafka batch source to v2
> 
>
> Key: SPARK-23098
> URL: https://issues.apache.org/jira/browse/SPARK-23098
> Project: Spark
>  Issue Type: Sub-task
>  Components: Structured Streaming
>Affects Versions: 2.4.0
>Reporter: Jose Torres
>Priority: Major
>







[jira] [Commented] (SPARK-23160) Add more window sql tests

2019-03-11 Thread Dylan Guedes (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-23160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16789877#comment-16789877
 ] 

Dylan Guedes commented on SPARK-23160:
--

Hi,

I would like to work on this one, but to be honest I didn't get the meaning of 
"tests in other major databases". [~jiangxb1987] do you remember what scenarios 
you had in mind?

> Add more window sql tests
> -
>
> Key: SPARK-23160
> URL: https://issues.apache.org/jira/browse/SPARK-23160
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Xingbo Jiang
>Priority: Minor
>
> We should also cover the window SQL interface, for example in 
> `sql/core/src/test/resources/sql-tests/inputs/window.sql`; it would also be 
> interesting to see whether we can generate consistent results for window tests 
> in other major databases.






[jira] [Commented] (SPARK-23931) High-order function: zip(array1, array2[, ...]) → array

2018-05-11 Thread Dylan Guedes (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16472514#comment-16472514
 ] 

Dylan Guedes commented on SPARK-23931:
--

[~mn-mikke] I updated with a working version! Would you mind giving 
feedback/suggestions? I've decided to use an array of structs, since Java 
doesn't handle Scala Tuple2s well, but to be honest I'm not sure it is the best 
choice.
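
For illustration, a sketch of the intended semantics against the struct-array 
representation, assuming the function lands under the `arrays_zip` name (the 
name is an assumption here; the PR was still under review):

{code:sql}
-- hypothetical call shape; the shorter array is padded with NULL,
-- matching the Presto behaviour quoted below:
SELECT arrays_zip(array(1, 2), array('1b', NULL, '3b'));
{code}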

> High-order function: zip(array1, array2[, ...]) → array
> 
>
> Key: SPARK-23931
> URL: https://issues.apache.org/jira/browse/SPARK-23931
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Xiao Li
>Priority: Major
>
> Ref: https://prestodb.io/docs/current/functions/array.html
> Merges the given arrays, element-wise, into a single array of rows. The M-th 
> element of the N-th argument will be the N-th field of the M-th output 
> element. If the arguments have an uneven length, missing values are filled 
> with NULL.
> {noformat}
> SELECT zip(ARRAY[1, 2], ARRAY['1b', null, '3b']); -- [ROW(1, '1b'), ROW(2, 
> null), ROW(null, '3b')]
> {noformat}






[jira] [Commented] (SPARK-23931) High-order function: zip(array1, array2[, ...]) → array

2018-05-11 Thread Dylan Guedes (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16472161#comment-16472161
 ] 

Dylan Guedes commented on SPARK-23931:
--

Hi Marek! I finally made some progress; I think a few more hours and I can 
complete this. Thank you!

> High-order function: zip(array1, array2[, ...]) → array
> 
>
> Key: SPARK-23931
> URL: https://issues.apache.org/jira/browse/SPARK-23931
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Xiao Li
>Priority: Major
>
> Ref: https://prestodb.io/docs/current/functions/array.html
> Merges the given arrays, element-wise, into a single array of rows. The M-th 
> element of the N-th argument will be the N-th field of the M-th output 
> element. If the arguments have an uneven length, missing values are filled 
> with NULL.
> {noformat}
> SELECT zip(ARRAY[1, 2], ARRAY['1b', null, '3b']); -- [ROW(1, '1b'), ROW(2, 
> null), ROW(null, '3b')]
> {noformat}






[jira] [Commented] (SPARK-23931) High-order function: zip(array1, array2[, ...]) → array

2018-04-12 Thread Dylan Guedes (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436184#comment-16436184
 ] 

Dylan Guedes commented on SPARK-23931:
--

I'm having some trouble, so I asked for help in the PR; suggestions/feedback 
are welcome.

> High-order function: zip(array1, array2[, ...]) → array
> 
>
> Key: SPARK-23931
> URL: https://issues.apache.org/jira/browse/SPARK-23931
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Xiao Li
>Priority: Major
>
> Ref: https://prestodb.io/docs/current/functions/array.html
> Merges the given arrays, element-wise, into a single array of rows. The M-th 
> element of the N-th argument will be the N-th field of the M-th output 
> element. If the arguments have an uneven length, missing values are filled 
> with NULL.
> {noformat}
> SELECT zip(ARRAY[1, 2], ARRAY['1b', null, '3b']); -- [ROW(1, '1b'), ROW(2, 
> null), ROW(null, '3b')]
> {noformat}






[jira] [Commented] (SPARK-23931) High-order function: zip(array1, array2[, ...]) → array

2018-04-10 Thread Dylan Guedes (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16432850#comment-16432850
 ] 

Dylan Guedes commented on SPARK-23931:
--

I would like to try this one.

> High-order function: zip(array1, array2[, ...]) → array
> 
>
> Key: SPARK-23931
> URL: https://issues.apache.org/jira/browse/SPARK-23931
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Xiao Li
>Priority: Major
>
> Ref: https://prestodb.io/docs/current/functions/array.html
> Merges the given arrays, element-wise, into a single array of rows. The M-th 
> element of the N-th argument will be the N-th field of the M-th output 
> element. If the arguments have an uneven length, missing values are filled 
> with NULL.
> {noformat}
> SELECT zip(ARRAY[1, 2], ARRAY['1b', null, '3b']); -- [ROW(1, '1b'), ROW(2, 
> null), ROW(null, '3b')]
> {noformat}






[jira] [Commented] (SPARK-20169) Groupby Bug with Sparksql

2018-03-15 Thread Dylan Guedes (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401029#comment-16401029
 ] 

Dylan Guedes commented on SPARK-20169:
--

Hi,

I also reproduced it in v2.3 and master. I think it is something related to the 
String type, because if I cast the jr dataframe column to long it works fine; 
however, if I cast it to String, the bug still happens.

I don't know the Catalyst codebase that well (never touched it, actually). Do 
you guys have a suggestion on where to start looking after I call _jdf? I don't 
know how to follow the trace after converting to the JVM.

Thank you!

> Groupby Bug with Sparksql
> -
>
> Key: SPARK-20169
> URL: https://issues.apache.org/jira/browse/SPARK-20169
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Bin Wu
>Priority: Major
>
> We find a potential bug in the Catalyst optimizer which cannot correctly 
> process "groupby". You can reproduce it with the following simple example:
> =
> from pyspark.sql.functions import *
> #e=sc.parallelize([(1,2),(1,3),(1,4),(2,1),(3,1),(4,1)]).toDF(["src","dst"])
> e = spark.read.csv("graph.csv", header=True)
> r = sc.parallelize([(1,),(2,),(3,),(4,)]).toDF(['src'])
> r1 = e.join(r, 'src').groupBy('dst').count().withColumnRenamed('dst','src')
> jr = e.join(r1, 'src')
> jr.show()
> r2 = jr.groupBy('dst').count()
> r2.show()
> =
> FYI, "graph.csv" contains exactly the same data as the commented line.
> You can find that jr is:
> |src|dst|count|
> |  3|  1|1|
> |  1|  4|3|
> |  1|  3|3|
> |  1|  2|3|
> |  4|  1|1|
> |  2|  1|1|
> But, after the last groupBy, the 3 rows with dst = 1 are not grouped together:
> |dst|count|
> |  1|1|
> |  4|1|
> |  3|1|
> |  2|1|
> |  1|1|
> |  1|1|
> If we build jr directly from raw data (commented line), this error will not 
> show up. So we suspect that there is a bug in the Catalyst optimizer when 
> multiple joins and groupBy's are being optimized.






[jira] [Created] (SPARK-23648) extend hint syntax to support any expression for R

2018-03-11 Thread Dylan Guedes (JIRA)
Dylan Guedes created SPARK-23648:


 Summary: extend hint syntax to support any expression for R
 Key: SPARK-23648
 URL: https://issues.apache.org/jira/browse/SPARK-23648
 Project: Spark
  Issue Type: Sub-task
  Components: SparkR, SQL
Affects Versions: 2.3.0, 2.2.0
Reporter: Dylan Guedes


Relax checks in
[https://github.com/apache/spark/blob/7f203a248f94df6183a4bc4642a3d873171fef29/R/pkg/R/DataFrame.R#L3746]

 






[jira] [Created] (SPARK-23647) extend hint syntax to support any expression for Python

2018-03-11 Thread Dylan Guedes (JIRA)
Dylan Guedes created SPARK-23647:


 Summary: extend hint syntax to support any expression for Python
 Key: SPARK-23647
 URL: https://issues.apache.org/jira/browse/SPARK-23647
 Project: Spark
  Issue Type: Sub-task
  Components: PySpark, SQL
Affects Versions: 2.2.0
Reporter: Dylan Guedes


Relax checks in

[https://github.com/apache/spark/blob/6cbc61d1070584ffbc34b1f53df352c9162f414a/python/pyspark/sql/dataframe.py#L422]

 






[jira] [Commented] (SPARK-21030) extend hint syntax to support any expression for Python and R

2018-03-08 Thread Dylan Guedes (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392049#comment-16392049
 ] 

Dylan Guedes commented on SPARK-21030:
--

So, I started, and here is my progress: 
[https://github.com/DylanGuedes/spark/commit/433c622ae987f2b6e2a9a5bc97a0addc0d938d4b]

Could anyone give me feedback/hints before I open the PR?

> extend hint syntax to support any expression for Python and R
> -
>
> Key: SPARK-21030
> URL: https://issues.apache.org/jira/browse/SPARK-21030
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark, SparkR, SQL
>Affects Versions: 2.2.0
>Reporter: Felix Cheung
>Priority: Major
>
> See SPARK-20854
> we need to relax checks in 
> https://github.com/apache/spark/blob/6cbc61d1070584ffbc34b1f53df352c9162f414a/python/pyspark/sql/dataframe.py#L422
> and
> https://github.com/apache/spark/blob/7f203a248f94df6183a4bc4642a3d873171fef29/R/pkg/R/DataFrame.R#L3746






[jira] [Comment Edited] (SPARK-21030) extend hint syntax to support any expression for Python and R

2018-03-08 Thread Dylan Guedes (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392049#comment-16392049
 ] 

Dylan Guedes edited comment on SPARK-21030 at 3/8/18 10:50 PM:
---

So, I started, and here is my progress: 
[https://github.com/DylanGuedes/spark/commit/433c622ae987f2b6e2a9a5bc97a0addc0d938d4b]

Could anyone give me feedback/hints before I open the PR? I'm not sure if my 
approach is correct.


was (Author: dylanguedes):
So, I started, and here is my progress: 
[https://github.com/DylanGuedes/spark/commit/433c622ae987f2b6e2a9a5bc97a0addc0d938d4b]

Could anyone give me feedback/hints before I open the PR?

> extend hint syntax to support any expression for Python and R
> -
>
> Key: SPARK-21030
> URL: https://issues.apache.org/jira/browse/SPARK-21030
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark, SparkR, SQL
>Affects Versions: 2.2.0
>Reporter: Felix Cheung
>Priority: Major
>
> See SPARK-20854
> we need to relax checks in 
> https://github.com/apache/spark/blob/6cbc61d1070584ffbc34b1f53df352c9162f414a/python/pyspark/sql/dataframe.py#L422
> and
> https://github.com/apache/spark/blob/7f203a248f94df6183a4bc4642a3d873171fef29/R/pkg/R/DataFrame.R#L3746






[jira] [Comment Edited] (SPARK-21030) extend hint syntax to support any expression for Python and R

2018-03-07 Thread Dylan Guedes (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390321#comment-16390321
 ] 

Dylan Guedes edited comment on SPARK-21030 at 3/7/18 10:11 PM:
---

Hi,

I would like to try this one (for Python). Do you guys think that this is a 
good one for a newcomer?

Thank you!


was (Author: dylanguedes):
Hi,

I would like to try this one. Do you guys think that this is a good one for a 
newcomer?

Thank you!

> extend hint syntax to support any expression for Python and R
> -
>
> Key: SPARK-21030
> URL: https://issues.apache.org/jira/browse/SPARK-21030
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark, SparkR, SQL
>Affects Versions: 2.2.0
>Reporter: Felix Cheung
>Priority: Major
>
> See SPARK-20854
> we need to relax checks in 
> https://github.com/apache/spark/blob/6cbc61d1070584ffbc34b1f53df352c9162f414a/python/pyspark/sql/dataframe.py#L422
> and
> https://github.com/apache/spark/blob/7f203a248f94df6183a4bc4642a3d873171fef29/R/pkg/R/DataFrame.R#L3746






[jira] [Comment Edited] (SPARK-21030) extend hint syntax to support any expression for Python and R

2018-03-07 Thread Dylan Guedes (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390321#comment-16390321
 ] 

Dylan Guedes edited comment on SPARK-21030 at 3/7/18 10:11 PM:
---

Hi,

I would like to try this one (in Python). Do you guys think that this is a good 
one for a newcomer?

Thank you!


was (Author: dylanguedes):
Hi,

I would like to try this one (for Python). Do you guys think that this is a 
good one for a newcomer?

Thank you!

> extend hint syntax to support any expression for Python and R
> -
>
> Key: SPARK-21030
> URL: https://issues.apache.org/jira/browse/SPARK-21030
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark, SparkR, SQL
>Affects Versions: 2.2.0
>Reporter: Felix Cheung
>Priority: Major
>
> See SPARK-20854
> we need to relax checks in 
> https://github.com/apache/spark/blob/6cbc61d1070584ffbc34b1f53df352c9162f414a/python/pyspark/sql/dataframe.py#L422
> and
> https://github.com/apache/spark/blob/7f203a248f94df6183a4bc4642a3d873171fef29/R/pkg/R/DataFrame.R#L3746






[jira] [Commented] (SPARK-21030) extend hint syntax to support any expression for Python and R

2018-03-07 Thread Dylan Guedes (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390321#comment-16390321
 ] 

Dylan Guedes commented on SPARK-21030:
--

Hi,

I would like to try this one. Do you guys think that this is a good one for a 
newcomer?

Thank you!

> extend hint syntax to support any expression for Python and R
> -
>
> Key: SPARK-21030
> URL: https://issues.apache.org/jira/browse/SPARK-21030
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark, SparkR, SQL
>Affects Versions: 2.2.0
>Reporter: Felix Cheung
>Priority: Major
>
> See SPARK-20854
> we need to relax checks in 
> https://github.com/apache/spark/blob/6cbc61d1070584ffbc34b1f53df352c9162f414a/python/pyspark/sql/dataframe.py#L422
> and
> https://github.com/apache/spark/blob/7f203a248f94df6183a4bc4642a3d873171fef29/R/pkg/R/DataFrame.R#L3746






[jira] [Commented] (SPARK-23595) Add interpreted execution for ValidateExternalType expression

2018-03-06 Thread Dylan Guedes (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388136#comment-16388136
 ] 

Dylan Guedes commented on SPARK-23595:
--

[~maropu] I checked your progress, and it looks like you are almost finished, 
so it is fine. Anyway, your solution was very enlightening, thank you!

> Add interpreted execution for ValidateExternalType expression
> -
>
> Key: SPARK-23595
> URL: https://issues.apache.org/jira/browse/SPARK-23595
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Herman van Hovell
>Priority: Major
>







[jira] [Commented] (SPARK-23595) Add interpreted execution for ValidateExternalType expression

2018-03-06 Thread Dylan Guedes (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16387853#comment-16387853
 ] 

Dylan Guedes commented on SPARK-23595:
--

Hi,

I would like to help with this issue, but since I am a newcomer I am not sure 
if it is a good way to start (maybe it is too hard and I don't want to be a 
bottleneck). I started reading the code of the related issues; is this one 
similar? What do you guys think?

Thank you!

> Add interpreted execution for ValidateExternalType expression
> -
>
> Key: SPARK-23595
> URL: https://issues.apache.org/jira/browse/SPARK-23595
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Herman van Hovell
>Priority: Major
>







[jira] [Comment Edited] (SPARK-23595) Add interpreted execution for ValidateExternalType expression

2018-03-06 Thread Dylan Guedes (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16387853#comment-16387853
 ] 

Dylan Guedes edited comment on SPARK-23595 at 3/6/18 2:24 PM:
--

Hi,

I would like to help with this issue, but since I am a newcomer I am not sure 
if it is a good way to start (maybe it is too hard and I don't want to be a 
bottleneck). I started reading the code of the related issues; is this one 
similar? What do you guys think?

Thank you!


was (Author: dylanguedes):
Hi,

I would like to help with this issue, but since I am a newcomer I am not sure 
if it is a good way to start (maybe it is too hard and I don't want to be a 
bottleneck). I started reading code of the related issues, it is similar? What 
do you guys think?

Thank you!

> Add interpreted execution for ValidateExternalType expression
> -
>
> Key: SPARK-23595
> URL: https://issues.apache.org/jira/browse/SPARK-23595
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Herman van Hovell
>Priority: Major
>



