[jira] [Updated] (SPARK-46627) Streaming UI hover-over shows incorrect value

2024-01-09 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-46627:
---
Issue Type: Bug  (was: Task)

> Streaming UI hover-over shows incorrect value
> -
>
> Key: SPARK-46627
> URL: https://issues.apache.org/jira/browse/SPARK-46627
> Project: Spark
>  Issue Type: Bug
>  Components: Structured Streaming, UI, Web UI
>Affects Versions: 4.0.0
>Reporter: Wei Liu
>Assignee: Kent Yao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: Screenshot 2024-01-08 at 1.55.57 PM.png, Screenshot 
> 2024-01-08 at 15.06.24.png
>
>
> Running a simple streaming query:
> val df = spark.readStream.format("rate").option("rowsPerSecond", 
> "5000").load()
> val q = df.writeStream.format("noop").start()
>  
> The hover-over value is incorrect in the streaming UI (it shows "321.00 at undefined").
>  
> !Screenshot 2024-01-08 at 1.55.57 PM.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-46627) Streaming UI hover-over shows incorrect value

2024-01-09 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-46627.

Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/44633

> Streaming UI hover-over shows incorrect value
> -
>
> Key: SPARK-46627
> URL: https://issues.apache.org/jira/browse/SPARK-46627
> Project: Spark
>  Issue Type: Task
>  Components: Structured Streaming, UI, Web UI
>Affects Versions: 4.0.0
>Reporter: Wei Liu
>Assignee: Kent Yao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: Screenshot 2024-01-08 at 1.55.57 PM.png, Screenshot 
> 2024-01-08 at 15.06.24.png
>
>
> Running a simple streaming query:
> val df = spark.readStream.format("rate").option("rowsPerSecond", 
> "5000").load()
> val q = df.writeStream.format("noop").start()
>  
> The hover-over value is incorrect in the streaming UI (it shows "321.00 at undefined").
>  
> !Screenshot 2024-01-08 at 1.55.57 PM.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-46627) Streaming UI hover-over shows incorrect value

2024-01-09 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta reassigned SPARK-46627:
--

Assignee: Kent Yao

> Streaming UI hover-over shows incorrect value
> -
>
> Key: SPARK-46627
> URL: https://issues.apache.org/jira/browse/SPARK-46627
> Project: Spark
>  Issue Type: Task
>  Components: Structured Streaming, UI, Web UI
>Affects Versions: 4.0.0
>Reporter: Wei Liu
>Assignee: Kent Yao
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2024-01-08 at 1.55.57 PM.png, Screenshot 
> 2024-01-08 at 15.06.24.png
>
>
> Running a simple streaming query:
> val df = spark.readStream.format("rate").option("rowsPerSecond", 
> "5000").load()
> val q = df.writeStream.format("noop").start()
>  
> The hover-over value is incorrect in the streaming UI (it shows "321.00 at undefined").
>  
> !Screenshot 2024-01-08 at 1.55.57 PM.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-44490) Remove TaskPagedTable in StagePage

2023-08-01 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-44490.

Fix Version/s: 4.0.0
 Assignee: dzcxzl
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/42085

> Remove TaskPagedTable in StagePage
> --
>
> Key: SPARK-44490
> URL: https://issues.apache.org/jira/browse/SPARK-44490
> Project: Spark
>  Issue Type: Improvement
>  Components: Web UI
>Affects Versions: 3.4.1
>Reporter: dzcxzl
>Assignee: dzcxzl
>Priority: Minor
> Fix For: 4.0.0
>
>
> In [SPARK-21809|https://issues.apache.org/jira/browse/SPARK-21809], we 
> introduced stagespage-template.html to show the running status of a stage. 
> TaskPagedTable is no longer effective, but there are still many PRs updating 
> the related code.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-44279) Upgrade optionator to ^0.9.3

2023-07-13 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-44279.

Target Version/s: 3.5.0
Assignee: Bjørn Jørgensen
  Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/41955

> Upgrade optionator to ^0.9.3
> 
>
> Key: SPARK-44279
> URL: https://issues.apache.org/jira/browse/SPARK-44279
> Project: Spark
>  Issue Type: Dependency upgrade
>  Components: Build
>Affects Versions: 3.4.1, 3.5.0
>Reporter: Bjørn Jørgensen
>Assignee: Bjørn Jørgensen
>Priority: Minor
>
> [Regular Expression Denial of Service (ReDoS) - 
> CVE-2023-26115|https://github.com/jonschlinkert/word-wrap/issues/32]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-44279) Upgrade optionator to ^0.9.3

2023-07-13 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-44279:
---
Priority: Minor  (was: Major)

> Upgrade optionator to ^0.9.3
> 
>
> Key: SPARK-44279
> URL: https://issues.apache.org/jira/browse/SPARK-44279
> Project: Spark
>  Issue Type: Dependency upgrade
>  Components: Build
>Affects Versions: 3.4.1, 3.5.0
>Reporter: Bjørn Jørgensen
>Priority: Minor
>
> [Regular Expression Denial of Service (ReDoS) - 
> CVE-2023-26115|https://github.com/jonschlinkert/word-wrap/issues/32]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-41634) Upgrade minimatch to 3.1.2

2022-12-20 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-41634.

Fix Version/s: 3.4.0
 Assignee: Bjørn Jørgensen
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/39143

> Upgrade minimatch to 3.1.2 
> ---
>
> Key: SPARK-41634
> URL: https://issues.apache.org/jira/browse/SPARK-41634
> Project: Spark
>  Issue Type: Dependency upgrade
>  Components: Build
>Affects Versions: 3.4.0
>Reporter: Bjørn Jørgensen
>Assignee: Bjørn Jørgensen
>Priority: Minor
> Fix For: 3.4.0
>
>
> [CVE-2022-3517|https://nvd.nist.gov/vuln/detail/CVE-2022-3517]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-41587) Upgrade org.scalatestplus:selenium-4-4 to org.scalatestplus:selenium-4-7

2022-12-20 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-41587.

Fix Version/s: 3.4.0
 Assignee: Yang Jie
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/39129

> Upgrade org.scalatestplus:selenium-4-4 to org.scalatestplus:selenium-4-7
> 
>
> Key: SPARK-41587
> URL: https://issues.apache.org/jira/browse/SPARK-41587
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
> Fix For: 3.4.0
>
>
> https://github.com/scalatest/scalatestplus-selenium/releases/tag/release-3.2.14.0-for-selenium-4.7



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-40397) Migrate selenium-java from 3.1 to 4.2 and upgrade org.scalatestplus:selenium to 3.2.13.0

2022-09-14 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-40397.

Fix Version/s: 3.4.0
 Assignee: Yang Jie
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/37868

> Migrate selenium-java from 3.1 to 4.2 and upgrade org.scalatestplus:selenium 
> to 3.2.13.0
> 
>
> Key: SPARK-40397
> URL: https://issues.apache.org/jira/browse/SPARK-40397
> Project: Spark
>  Issue Type: Improvement
>  Components: Build, Tests
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Minor
> Fix For: 3.4.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-38303) Upgrade ansi-regex from 5.0.0 to 5.0.1 in /dev

2022-02-24 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-38303.

Fix Version/s: 3.3.0
   3.2.2
 Assignee: Bjørn Jørgensen
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/35628

> Upgrade ansi-regex from 5.0.0 to 5.0.1 in /dev
> --
>
> Key: SPARK-38303
> URL: https://issues.apache.org/jira/browse/SPARK-38303
> Project: Spark
>  Issue Type: Bug
>  Components: Build
>Affects Versions: 3.2.1, 3.3.0
>Reporter: Bjørn Jørgensen
>Assignee: Bjørn Jørgensen
>Priority: Major
> Fix For: 3.3.0, 3.2.2
>
>
> [CVE-2021-3807|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-3807]
>   
> [releases notes at github|https://github.com/chalk/ansi-regex/releases]
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-38303) Upgrade ansi-regex from 5.0.0 to 5.0.1 in /dev

2022-02-24 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-38303:
---
Affects Version/s: 3.2.1

> Upgrade ansi-regex from 5.0.0 to 5.0.1 in /dev
> --
>
> Key: SPARK-38303
> URL: https://issues.apache.org/jira/browse/SPARK-38303
> Project: Spark
>  Issue Type: Bug
>  Components: Build
>Affects Versions: 3.2.1, 3.3.0
>Reporter: Bjørn Jørgensen
>Priority: Major
>
> [CVE-2021-3807|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-3807]
>   
> [releases notes at github|https://github.com/chalk/ansi-regex/releases]
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-38278) Add SparkContext.addArchive in PySpark

2022-02-22 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-38278.

  Assignee: Hyukjin Kwon
Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/35603

> Add SparkContext.addArchive in PySpark
> --
>
> Key: SPARK-38278
> URL: https://issues.apache.org/jira/browse/SPARK-38278
> Project: Spark
>  Issue Type: New Feature
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>
> SPARK-33530 added the {{SparkContext.addArchive}} API. We should have one in 
> PySpark too.
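A minimal usage sketch of the PySpark counterpart (this assumes Spark 3.3.0+, where
pyspark's SparkContext.addArchive is available; the archive path below is a hypothetical example):

{code}
from pyspark import SparkContext, SparkFiles

sc = SparkContext.getOrCreate()
# Ship an archive (zip/tar.gz/jar) to every node; it is unpacked on the executors.
sc.addArchive("/tmp/resources.zip")  # hypothetical path

def read_from_archive(_):
    # SparkFiles.get returns the directory the named archive was unpacked into.
    unpacked_dir = SparkFiles.get("resources.zip")
    return [unpacked_dir]

print(sc.parallelize([0], 1).flatMap(read_from_archive).collect())
{code}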



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-38278) Add SparkContext.addArchive in PySpark

2022-02-22 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-38278:
---
Fix Version/s: 3.3.0

> Add SparkContext.addArchive in PySpark
> --
>
> Key: SPARK-38278
> URL: https://issues.apache.org/jira/browse/SPARK-38278
> Project: Spark
>  Issue Type: New Feature
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
> Fix For: 3.3.0
>
>
> SPARK-33530 added the {{SparkContext.addArchive}} API. We should have one in 
> PySpark too.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-36808) Upgrade Kafka to 2.8.1

2022-02-15 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-36808:
---
Fix Version/s: 3.2.2

> Upgrade Kafka to 2.8.1
> --
>
> Key: SPARK-36808
> URL: https://issues.apache.org/jira/browse/SPARK-36808
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.2.1, 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Major
> Fix For: 3.3.0, 3.2.2
>
>
> A few hours ago, Kafka 2.8.1 was released, which includes a bunch of bug fixes.
> https://downloads.apache.org/kafka/2.8.1/RELEASE_NOTES.html



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-36808) Upgrade Kafka to 2.8.1

2022-02-15 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-36808:
---
Affects Version/s: 3.2.1

> Upgrade Kafka to 2.8.1
> --
>
> Key: SPARK-36808
> URL: https://issues.apache.org/jira/browse/SPARK-36808
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.2.1, 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Major
> Fix For: 3.3.0
>
>
> A few hours ago, Kafka 2.8.1 was released, which includes a bunch of bug fixes.
> https://downloads.apache.org/kafka/2.8.1/RELEASE_NOTES.html



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-36808) Upgrade Kafka to 2.8.1

2022-02-15 Thread Kousuke Saruta (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17492452#comment-17492452
 ] 

Kousuke Saruta commented on SPARK-36808:


Ah, O.K. I misunderstood. I'll withdraw the PRs.




> Upgrade Kafka to 2.8.1
> --
>
> Key: SPARK-36808
> URL: https://issues.apache.org/jira/browse/SPARK-36808
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Major
> Fix For: 3.3.0
>
>
> A few hours ago, Kafka 2.8.1 was released, which includes a bunch of bug fixes.
> https://downloads.apache.org/kafka/2.8.1/RELEASE_NOTES.html



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-36808) Upgrade Kafka to 2.8.1

2022-02-14 Thread Kousuke Saruta (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17492408#comment-17492408
 ] 

Kousuke Saruta commented on SPARK-36808:


[~dongjoon] Sure, I'll do it.

> Upgrade Kafka to 2.8.1
> --
>
> Key: SPARK-36808
> URL: https://issues.apache.org/jira/browse/SPARK-36808
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Major
> Fix For: 3.3.0
>
>
> A few hours ago, Kafka 2.8.1 was released, which includes a bunch of bug fixes.
> https://downloads.apache.org/kafka/2.8.1/RELEASE_NOTES.html



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-38149) Upgrade joda-time to 2.10.13

2022-02-08 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-38149:
--

 Summary: Upgrade joda-time to 2.10.13
 Key: SPARK-38149
 URL: https://issues.apache.org/jira/browse/SPARK-38149
 Project: Spark
  Issue Type: Improvement
  Components: Build
Affects Versions: 3.3.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


joda-time 2.10.13 was released, which supports the latest TZ database release, 2021e.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37934) Upgrade Jetty version to 9.4.44

2022-02-08 Thread Kousuke Saruta (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17489017#comment-17489017
 ] 

Kousuke Saruta commented on SPARK-37934:


Issue resolved in https://github.com/apache/spark/pull/35442 for branch-3.2.

> Upgrade Jetty version to 9.4.44
> ---
>
> Key: SPARK-37934
> URL: https://issues.apache.org/jira/browse/SPARK-37934
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.2.0, 3.3.0
>Reporter: Sajith A
>Assignee: Sajith A
>Priority: Minor
> Fix For: 3.3.0, 3.2.2
>
>
> Upgrade Jetty version to 9.4.44.v20210927 in the current Spark master to bring in 
> the fix for the 
> [jetty#6973|https://github.com/eclipse/jetty.project/issues/6973] issue.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37934) Upgrade Jetty version to 9.4.44

2022-02-08 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37934:
---
Fix Version/s: 3.2.2

> Upgrade Jetty version to 9.4.44
> ---
>
> Key: SPARK-37934
> URL: https://issues.apache.org/jira/browse/SPARK-37934
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.2.0, 3.3.0
>Reporter: Sajith A
>Assignee: Sajith A
>Priority: Minor
> Fix For: 3.3.0, 3.2.2
>
>
> Upgrade Jetty version to 9.4.44.v20210927 in the current Spark master to bring in 
> the fix for the 
> [jetty#6973|https://github.com/eclipse/jetty.project/issues/6973] issue.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-38087) select doesnt validate if the column already exists

2022-02-06 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-38087:
---
Component/s: SQL
 (was: Spark Core)

> select doesnt validate if the column already exists
> ---
>
> Key: SPARK-38087
> URL: https://issues.apache.org/jira/browse/SPARK-38087
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.1
> Environment: Version: v3.2.1
> Master: local[*]
> (Reproducible in any environment)
>Reporter: Deepa Vasanthkumar
>Priority: Minor
> Attachments: select vs drop.png
>
>
>  
> Select doesn't validate whether the alias column is already present in the 
> DataFrame. 
> After that, we cannot do anything with that column in the DataFrame: 
> df4 = df2.select(df2.firstname, df2.lastname) --> throws AnalysisException
> df4.show()
>  
> However, drop will not let you drop the said column either. 
>  
> Scenario to reproduce :
> df2 = df1.select("*", (df1.firstname).alias("firstname"))   ---> this will 
> add same column
> df2.show() 
> df2.drop(df2.firstname) --> this will give AnalysisException: Reference 
> 'firstname' is ambiguous, could be: firstname, firstname.
>  
>  
> Is this expected behavior?
>   !select vs drop.png!
> !image-2022-02-02-06-28-23-543.png!
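A self-contained reproduction sketch of the scenario above (the DataFrame contents are
made-up stand-ins; any recent PySpark session is assumed):

{code}
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df1 = spark.createDataFrame([("Deepa", "V")], ["firstname", "lastname"])

# select() happily adds an alias that duplicates an existing column name.
df2 = df1.select("*", df1.firstname.alias("firstname"))
df2.show()

# Any later reference to the duplicated column is ambiguous and fails.
try:
    df2.select(df2.firstname, df2.lastname).show()
except Exception as e:
    # AnalysisException: Reference 'firstname' is ambiguous, could be: firstname, firstname.
    print(type(e).__name__, ":", e)
{code}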



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-38021) Upgrade dropwizard metrics from 4.2.2 to 4.2.7

2022-01-25 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-38021.

Fix Version/s: 3.3.0
 Assignee: Yang Jie
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/35317.

> Upgrade dropwizard metrics from 4.2.2 to 4.2.7
> --
>
> Key: SPARK-38021
> URL: https://issues.apache.org/jira/browse/SPARK-38021
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.3.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Minor
> Fix For: 3.3.0
>
>
> dropwizard metrics has released 5 versions after 4.2.2:
>  * [https://github.com/dropwizard/metrics/releases/tag/v4.2.3]
>  * [https://github.com/dropwizard/metrics/releases/tag/v4.2.4]
>  * [https://github.com/dropwizard/metrics/releases/tag/v4.2.5]
>  * [https://github.com/dropwizard/metrics/releases/tag/v4.2.6]
>  * [https://github.com/dropwizard/metrics/releases/tag/v4.2.7]
>  
> And as of version 4.2.5, codahale metrics supports building with JDK 17 
> (https://github.com/dropwizard/metrics/pull/2180)
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-38017) Fix the API doc for window to say it supports TimestampNTZType too as timeColumn

2022-01-25 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-38017.

Fix Version/s: 3.3.0
   3.2.2
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/35313.

> Fix the API doc for window to say it supports TimestampNTZType too as 
> timeColumn
> 
>
> Key: SPARK-38017
> URL: https://issues.apache.org/jira/browse/SPARK-38017
> Project: Spark
>  Issue Type: Bug
>  Components: Documentation, SQL
>Affects Versions: 3.2.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Minor
> Fix For: 3.3.0, 3.2.2
>
>
> The window function supports not only TimestampType but also TimestampNTZType, but 
> the API docs don't mention TimestampNTZType.
> This issue is similar to SPARK-38016, but this issue affects 3.2.0 too, so I 
> separate the tickets.
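A small sketch of what the corrected doc describes, i.e. window() accepting a
TIMESTAMP_NTZ column as timeColumn (this assumes Spark 3.3+, where the type and the
string-to-TIMESTAMP_NTZ cast are available; the sample data is made up):

{code}
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Build a TIMESTAMP_NTZ column via an explicit cast (the column carries no time zone).
df = spark.sql("""
    SELECT CAST('2022-01-25 00:00:10' AS TIMESTAMP_NTZ) AS ts, 1 AS v
    UNION ALL
    SELECT CAST('2022-01-25 00:07:30' AS TIMESTAMP_NTZ), 2
""")

# window() buckets the NTZ timestamps into 10-minute windows, just like TIMESTAMP ones.
df.groupBy(F.window("ts", "10 minutes")).agg(F.sum("v").alias("total")).show(truncate=False)
{code}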



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-38016) Fix the API doc for session_window to say it supports TimestampNTZType too as timeColumn

2022-01-25 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-38016.

Fix Version/s: 3.3.0
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/35312.

> Fix the API doc for session_window to say it supports TimestampNTZType too as 
> timeColumn
> 
>
> Key: SPARK-38016
> URL: https://issues.apache.org/jira/browse/SPARK-38016
> Project: Spark
>  Issue Type: Bug
>  Components: Documentation, SQL
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Minor
> Fix For: 3.3.0
>
>
> As of Spark 3.3.0, session_window supports not only TimestampType but also 
> TimestampNTZType, but the API docs don't mention TimestampNTZType.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-38016) Fix the API doc for session_window to say it supports TimestampNTZType too as timeColumn

2022-01-24 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-38016:
---
Description: As of Spark 3.3.0, session_window supports not only 
TimestampType but also TimestampNTZType but the API docs doesn't mention 
TimestampNTZType.  (was: As of Spark 3.3.0, session_window supports not only 
TimestampType but also TimestampNTZType but the API docs mention 
TimestampNTZType.)

> Fix the API doc for session_window to say it supports TimestampNTZType too as 
> timeColumn
> 
>
> Key: SPARK-38016
> URL: https://issues.apache.org/jira/browse/SPARK-38016
> Project: Spark
>  Issue Type: Bug
>  Components: Documentation, SQL
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Minor
>
> As of Spark 3.3.0, session_window supports not only TimestampType but also 
> TimestampNTZType, but the API docs don't mention TimestampNTZType.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-38017) Fix the API doc for window to say it supports TimestampNTZType too as timeColumn

2022-01-24 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-38017:
---
Description: 
window function supports not only TimestampType but also TimestampNTZType but 
the API docs doesn't mention TimestampNTZType.

This issue is similar to SPARK-38016, but this issue affects 3.2.0 too, so I 
separate the tickets.

  was:
window function supports not only TimestampType but also TimestampNTZType but 
the API docs mention TimestampNTZType.

This issue is similar to SPARK-38016, but this issue affects 3.2.0 too, so I 
separate the tickets.


> Fix the API doc for window to say it supports TimestampNTZType too as 
> timeColumn
> 
>
> Key: SPARK-38017
> URL: https://issues.apache.org/jira/browse/SPARK-38017
> Project: Spark
>  Issue Type: Bug
>  Components: Documentation, SQL
>Affects Versions: 3.2.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Minor
>
> The window function supports not only TimestampType but also TimestampNTZType, but 
> the API docs don't mention TimestampNTZType.
> This issue is similar to SPARK-38016, but this issue affects 3.2.0 too, so I 
> separate the tickets.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-38017) Fix the API doc for window to say it supports TimestampNTZType too as timeColumn

2022-01-24 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-38017:
--

 Summary: Fix the API doc for window to say it supports 
TimestampNTZType too as timeColumn
 Key: SPARK-38017
 URL: https://issues.apache.org/jira/browse/SPARK-38017
 Project: Spark
  Issue Type: Bug
  Components: Documentation, SQL
Affects Versions: 3.2.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


window function supports not only TimestampType but also TimestampNTZType but 
the API docs mention TimestampNTZType.

This issue is similar to SPARK-38016, but this issue affects 3.2.0 too, so I 
separate the tickets.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-38016) Fix the API doc for session_window to say it supports TimestampNTZType too as timeColumn

2022-01-24 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-38016:
---
Summary: Fix the API doc for session_window to say it supports 
TimestampNTZType too as timeColumn  (was: Fix the API doc for session_window to 
say it supports TimestampNTZType too as timeColumn.)

> Fix the API doc for session_window to say it supports TimestampNTZType too as 
> timeColumn
> 
>
> Key: SPARK-38016
> URL: https://issues.apache.org/jira/browse/SPARK-38016
> Project: Spark
>  Issue Type: Bug
>  Components: Documentation, SQL
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Minor
>
> As of Spark 3.3.0, session_window supports not only TimestampType but also 
> TimestampNTZType but the API docs mention TimestampNTZType.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-38016) Fix the API doc for session_window to say it supports TimestampNTZType too as timeColumn.

2022-01-24 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-38016:
---
Summary: Fix the API doc for session_window to say it supports 
TimestampNTZType too as timeColumn.  (was: Fix the API doc for window and 
session_window to say it supports TimestampNTZType too as timeColumn.)

> Fix the API doc for session_window to say it supports TimestampNTZType too as 
> timeColumn.
> -
>
> Key: SPARK-38016
> URL: https://issues.apache.org/jira/browse/SPARK-38016
> Project: Spark
>  Issue Type: Bug
>  Components: Documentation, SQL
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Minor
>
> As of Spark 3.3.0, session_window supports not only TimestampType but also 
> TimestampNTZType but the API docs mention TimestampNTZType.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-38016) Fix the API doc for window and session_window to say it supports TimestampNTZType too as timeColumn.

2022-01-24 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-38016:
--

 Summary: Fix the API doc for window and session_window to say it 
supports TimestampNTZType too as timeColumn.
 Key: SPARK-38016
 URL: https://issues.apache.org/jira/browse/SPARK-38016
 Project: Spark
  Issue Type: Bug
  Components: Documentation, SQL
Affects Versions: 3.3.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


As of Spark 3.3.0, session_window supports not only TimestampType but also 
TimestampNTZType but the API docs mention TimestampNTZType.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37860) [BUG] Revert: Fix taskid in the stage page task event timeline

2022-01-10 Thread Kousuke Saruta (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17472487#comment-17472487
 ] 

Kousuke Saruta commented on SPARK-37860:


Note: If the vote of Spark 3.2.1 RC1 passes, replace the fix version of 3.2.1 
with 3.2.2.

> [BUG] Revert: Fix taskid in the stage page task event timeline
> --
>
> Key: SPARK-37860
> URL: https://issues.apache.org/jira/browse/SPARK-37860
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 3.2.1
>Reporter: Jackey Lee
>Assignee: Jackey Lee
>Priority: Major
> Fix For: 3.1.3, 3.0.4, 3.2.1, 3.3.0
>
>
> In [#32888|https://github.com/apache/spark/pull/32888], 
> [@shahidki31|https://github.com/shahidki31] changed taskInfo.index to 
> taskInfo.taskId. However, we generally use {{index.attempt}} or {{taskId}} to 
> distinguish tasks within a stage, not {{taskId.attempt}}.
> Thus [#32888|https://github.com/apache/spark/pull/32888] was a wrong fix, 
> so we should revert it.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-37860) [BUG] Revert: Fix taskid in the stage page task event timeline

2022-01-10 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-37860.

Fix Version/s: 3.1.3
   3.0.4
   3.2.1
   3.3.0
 Assignee: Jackey Lee
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/35160

> [BUG] Revert: Fix taskid in the stage page task event timeline
> --
>
> Key: SPARK-37860
> URL: https://issues.apache.org/jira/browse/SPARK-37860
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 3.2.1
>Reporter: Jackey Lee
>Assignee: Jackey Lee
>Priority: Major
> Fix For: 3.1.3, 3.0.4, 3.2.1, 3.3.0
>
>
> In [#32888|https://github.com/apache/spark/pull/32888], 
> [@shahidki31|https://github.com/shahidki31] changed taskInfo.index to 
> taskInfo.taskId. However, we generally use {{index.attempt}} or {{taskId}} to 
> distinguish tasks within a stage, not {{taskId.attempt}}.
> Thus [#32888|https://github.com/apache/spark/pull/32888] was a wrong fix, 
> so we should revert it.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37159) Change HiveExternalCatalogVersionsSuite to be able to test with Java 17

2022-01-10 Thread Kousuke Saruta (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17472434#comment-17472434
 ] 

Kousuke Saruta commented on SPARK-37159:


All right. Thank you [~dongjoon]!

> Change HiveExternalCatalogVersionsSuite to be able to test with Java 17
> ---
>
> Key: SPARK-37159
> URL: https://issues.apache.org/jira/browse/SPARK-37159
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Tests
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Minor
> Fix For: 3.3.0
>
>
> SPARK-37105 seems to have fixed most of the tests in `sql/hive` for Java 17 except 
> `HiveExternalCatalogVersionsSuite`.
> {code}
> [info] org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED 
> *** (42 seconds, 526 milliseconds)
> [info]   spark-submit returned with exit code 1.
> [info]   Command line: 
> '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test-spark-d86af275-0c40-4b47-9cab-defa92a5ffa7/spark-3.2.0/bin/spark-submit'
>  '--name' 'prepare testing tables' '--master' 'local[2]' '--conf' 
> 'spark.ui.enabled=false' '--conf' 'spark.master.rest.enabled=false' '--conf' 
> 'spark.sql.hive.metastore.version=2.3' '--conf' 
> 'spark.sql.hive.metastore.jars=maven' '--conf' 
> 'spark.sql.warehouse.dir=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62'
>  '--conf' 'spark.sql.test.version.index=0' '--driver-java-options' 
> '-Dderby.system.home=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62'
>  
> '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test15166225869206697603.py'
> [info]   
> [info]   2021-10-28 06:07:18.486 - stderr> Using Spark's default log4j 
> profile: org/apache/spark/log4j-defaults.properties
> [info]   2021-10-28 06:07:18.49 - stderr> 21/10/28 22:07:18 INFO 
> SparkContext: Running Spark version 3.2.0
> [info]   2021-10-28 06:07:18.537 - stderr> 21/10/28 22:07:18 WARN 
> NativeCodeLoader: Unable to load native-hadoop library for your platform... 
> using builtin-java classes where applicable
> [info]   2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO 
> ResourceUtils: ==
> [info]   2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO 
> ResourceUtils: No custom resources configured for spark.driver.
> [info]   2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO 
> ResourceUtils: ==
> [info]   2021-10-28 06:07:18.617 - stderr> 21/10/28 22:07:18 INFO 
> SparkContext: Submitted application: prepare testing tables
> [info]   2021-10-28 06:07:18.632 - stderr> 21/10/28 22:07:18 INFO 
> ResourceProfile: Default ResourceProfile created, executor resources: 
> Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: 
> memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 
> 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
> [info]   2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO 
> ResourceProfile: Limiting resource is cpu
> [info]   2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO 
> ResourceProfileManager: Added ResourceProfile id: 0
> [info]   2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: Changing view acls to: kou
> [info]   2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: Changing modify acls to: kou
> [info]   2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: Changing view acls groups to: 
> [info]   2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: Changing modify acls groups to: 
> [info]   2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: SecurityManager: authentication disabled; ui acls disabled; 
> users  with view permissions: Set(kou); groups with view permissions: Set(); 
> users  with modify permissions: Set(kou); groups with modify permissions: 
> Set()
> [info]   2021-10-28 06:07:18.886 - stderr> 21/10/28 22:07:18 INFO Utils: 
> Successfully started service 'sparkDriver' on port 35867.
> [info]   2021-10-28 06:07:18.906 - stderr> 21/10/28 22:07:18 INFO SparkEnv: 
> Registering MapOutputTracker
> [info]   2021-10-28 06:07:18.93 - stderr> 21/10/28 22:07:18 INFO SparkEnv: 
> Registering BlockManagerMaster
> [info]   2021-10-28 06:07:18.943 - stderr> 21/10/28 22:07:18 INFO 
> 

[jira] [Resolved] (SPARK-37792) Spark shell sets log level to INFO by default

2022-01-04 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-37792.

Fix Version/s: 3.3.0
 Assignee: L. C. Hsieh  (was: Apache Spark)
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/35080

> Spark shell sets log level to INFO by default
> -
>
> Key: SPARK-37792
> URL: https://issues.apache.org/jira/browse/SPARK-37792
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Shell
>Affects Versions: 3.3.0
>Reporter: Hyukjin Kwon
>Assignee: L. C. Hsieh
>Priority: Major
> Fix For: 3.3.0
>
>
> {code}
> ./bin/spark-shell
> {code}
> {code}
> Setting default log level to "WARN".
> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use 
> setLogLevel(newLevel).
> 21/12/31 10:55:04 INFO SignalUtils: Registering signal handler for INT
> 21/12/31 10:55:08 INFO HiveConf: Found configuration file null
> 21/12/31 10:55:08 INFO SparkContext: Running Spark version 3.3.0-SNAPSHOT
> ...
> 21/12/31 10:55:09 INFO BlockManager: Initialized BlockManager: 
> BlockManagerId(driver, ..., None)
> ...
> Welcome to
>     __
>  / __/__  ___ _/ /__
> _\ \/ _ \/ _ `/ __/  '_/
>/___/ .__/\_,_/_/ /_/\_\   version 3.3.0-SNAPSHOT
>   /_/
> Using Scala version 2.12.15 (Java HotSpot(TM) 64-Bit Server VM, Java 
> 1.8.0_291)
> Type in expressions to have them evaluated.
> Type :help for more information.
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37778) Upgrade SBT to 1.6.1

2021-12-29 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-37778:
--

 Summary: Upgrade SBT to 1.6.1
 Key: SPARK-37778
 URL: https://issues.apache.org/jira/browse/SPARK-37778
 Project: Spark
  Issue Type: Bug
  Components: Build
Affects Versions: 3.3.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


SBT 1.6.1 was released, which upgrades log4j 2 to 2.17.1 for CVE-2021-44832.
https://github.com/sbt/sbt/releases/tag/v1.6.1



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-37391) SIGNIFICANT bottleneck introduced by fix for SPARK-32001

2021-12-23 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-37391.

Fix Version/s: 3.3.0
 Assignee: Danny Guinther
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/34745 for Spark 3.3.0.

> SIGNIFICANT bottleneck introduced by fix for SPARK-32001
> 
>
> Key: SPARK-37391
> URL: https://issues.apache.org/jira/browse/SPARK-37391
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.1.0, 3.1.1, 3.1.2, 3.2.0
> Environment: N/A
>Reporter: Danny Guinther
>Assignee: Danny Guinther
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: so-much-blocking.jpg, spark-regression-dashes.jpg
>
>
> The fix for https://issues.apache.org/jira/browse/SPARK-32001 ( 
> [https://github.com/apache/spark/pull/29024/files#diff-345beef18081272d77d91eeca2d9b5534ff6e642245352f40f4e9c9b8922b085R58]
>  ) does not seem to have considered the reality that some apps may rely on 
> being able to establish many JDBC connections simultaneously for performance 
> reasons.
> The fix forces concurrency to 1 when establishing database connections and 
> that strikes me as a *significant* user impacting change and a *significant* 
> bottleneck.
> Can anyone propose a workaround for this? I have an app that makes 
> connections to thousands of databases and I can't upgrade to any version 
> >3.1.x because of this significant bottleneck.
>  
> Thanks in advance for your help!



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37663) Mitigate ConcurrentModificationException thrown from tests in SparkContextSuite

2021-12-16 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37663:
---
Summary: Mitigate ConcurrentModificationException thrown from tests in 
SparkContextSuite  (was: Mitigate ConcurrentModificationException thrown from a 
test in SparkContextSuite)

> Mitigate ConcurrentModificationException thrown from tests in 
> SparkContextSuite
> ---
>
> Key: SPARK-37663
> URL: https://issues.apache.org/jira/browse/SPARK-37663
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core, Tests
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Minor
>
> ConcurrentModificationException can be thrown from tests in SparkContextSuite 
> with Scala 2.13.
> The cause seems to be the same as SPARK-37315.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37663) SPARK-37315][ML][TEST] Mitigate ConcurrentModificationException thrown from a test in SparkContextSuite

2021-12-16 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-37663:
--

 Summary: SPARK-37315][ML][TEST] Mitigate 
ConcurrentModificationException thrown from a test in SparkContextSuite
 Key: SPARK-37663
 URL: https://issues.apache.org/jira/browse/SPARK-37663
 Project: Spark
  Issue Type: Bug
  Components: Spark Core, Tests
Affects Versions: 3.3.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


ConcurrentModificationException can be thrown from tests in SparkContextSuite 
with Scala 2.13.
The cause seems to be the same as SPARK-37315.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37663) Mitigate ConcurrentModificationException thrown from a test in SparkContextSuite

2021-12-16 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37663:
---
Summary: Mitigate ConcurrentModificationException thrown from a test in 
SparkContextSuite  (was: SPARK-37315][ML][TEST] Mitigate 
ConcurrentModificationException thrown from a test in SparkContextSuite)

> Mitigate ConcurrentModificationException thrown from a test in 
> SparkContextSuite
> 
>
> Key: SPARK-37663
> URL: https://issues.apache.org/jira/browse/SPARK-37663
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core, Tests
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Minor
>
> ConcurrentModificationException can be thrown from tests in SparkContextSuite 
> with Scala 2.13.
> The cause seems to be the same as SPARK-37315.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37656) Upgrade SBT to 1.5.7

2021-12-15 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-37656:
--

 Summary: Upgrade SBT to 1.5.7
 Key: SPARK-37656
 URL: https://issues.apache.org/jira/browse/SPARK-37656
 Project: Spark
  Issue Type: Bug
  Components: Build
Affects Versions: 3.2.1, 3.3.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


SBT 1.5.7 was released a few hours ago, which includes a fix for CVE-2021-45046.
https://github.com/sbt/sbt/releases/tag/v1.5.7



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-37635) SHOW TBLPROPERTIES should print the fully qualified table name

2021-12-14 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-37635.

Fix Version/s: 3.3.0
 Assignee: Wenchen Fan
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/34890

> SHOW TBLPROPERTIES should print the fully qualified table name
> --
>
> Key: SPARK-37635
> URL: https://issues.apache.org/jira/browse/SPARK-37635
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
> Fix For: 3.3.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-37310) Migrate ALTER NAMESPACE ... SET PROPERTIES to use v2 command by default

2021-12-14 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-37310.

Fix Version/s: 3.3.0
 Assignee: Terry Kim  (was: Apache Spark)
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/34891

> Migrate ALTER NAMESPACE ... SET PROPERTIES to use v2 command by default
> ---
>
> Key: SPARK-37310
> URL: https://issues.apache.org/jira/browse/SPARK-37310
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Terry Kim
>Assignee: Terry Kim
>Priority: Major
> Fix For: 3.3.0
>
>
> Migrate ALTER NAMESPACE ... SET PROPERTIES to use v2 command by default



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-36038) Basic speculation metrics at stage level

2021-12-12 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-36038.

  Assignee: Thejdeep Gudivada
Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/34607

> Basic speculation metrics at stage level
> 
>
> Key: SPARK-36038
> URL: https://issues.apache.org/jira/browse/SPARK-36038
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.1.2
>Reporter: Venkata krishnan Sowrirajan
>Assignee: Thejdeep Gudivada
>Priority: Major
> Fix For: 3.3.0
>
>
> Currently there are no speculation metrics available either at the application 
> level or at the stage level. Within our platform, we have added speculation 
> metrics at the stage level as a summary, similar to the existing stage-level metrics, 
> tracking numTotalSpeculated, numCompleted (successful), numFailed, numKilled, 
> etc. This enables us to effectively understand the speculative execution feature 
> at an application level and helps in further tuning the speculation configs.
> cc [~ron8hu]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-37586) Add cipher mode option and set default cipher mode for aes_encrypt and aes_decrypt

2021-12-08 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-37586.

Fix Version/s: 3.3.0
 Assignee: Max Gekk
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/34837

> Add cipher mode option and set default cipher mode for aes_encrypt and 
> aes_decrypt
> --
>
> Key: SPARK-37586
> URL: https://issues.apache.org/jira/browse/SPARK-37586
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Max Gekk
>Assignee: Max Gekk
>Priority: Major
> Fix For: 3.3.0
>
>
> https://github.com/apache/spark/pull/32801 added the aes_encrypt/aes_decrypt 
> functions to Spark. However, they rely on the JVM's configuration for 
> which cipher mode to use, which is problematic as it is not fixed across 
> versions and systems.
> Let's hardcode a default cipher mode and also allow users to set a cipher 
> mode as an argument to the function.
> In the future, we can support other modes like GCM and CBC that have been 
> already supported by other systems:
> # Snowflake: 
> https://docs.snowflake.com/en/sql-reference/functions/encrypt.html
> # Bigquery: 
> https://cloud.google.com/bigquery/docs/reference/standard-sql/aead-encryption-concepts#block_cipher_modes
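A hedged sketch of the resulting API shape (this assumes the behavior as released in
Spark 3.3, where aes_encrypt/aes_decrypt take an optional mode argument such as 'ECB'
or 'GCM', with 'GCM' as the default; the 16-byte key below is a throwaway example):

{code}
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
# Encrypt and decrypt with an explicit cipher mode, then cast back to a string.
spark.sql("""
    SELECT CAST(
             aes_decrypt(aes_encrypt('Spark', '0000111122223333', 'GCM'),
                         '0000111122223333', 'GCM')
           AS STRING) AS roundtrip
""").show()
{code}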



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37568) Support 2-arguments by the convert_timezone() function

2021-12-07 Thread Kousuke Saruta (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454960#comment-17454960
 ] 

Kousuke Saruta commented on SPARK-37568:


[~yoda-mon] OK, please go ahead.

> Support 2-arguments by the convert_timezone() function
> --
>
> Key: SPARK-37568
> URL: https://issues.apache.org/jira/browse/SPARK-37568
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Max Gekk
>Priority: Major
>
> # If sourceTs is a timestamp_ntz, take the sourceTz from the session time 
> zone, see the SQL config spark.sql.session.timeZone
> # If sourceTs is a timestamp_ltz, convert it to a timestamp_ntz using the 
> targetTz
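A hedged sketch of the proposed 2-argument form, where the source time zone falls back
to spark.sql.session.timeZone (this assumes a Spark version in which the built-in
convert_timezone SQL function with the signature convert_timezone([sourceTz, ]targetTz,
sourceTs) is available; availability in 3.3 is an assumption):

{code}
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
spark.conf.set("spark.sql.session.timeZone", "UTC")

# 2-argument form: the source time zone defaults to the session time zone (UTC here).
spark.sql("""
    SELECT convert_timezone(
             'America/Los_Angeles',
             CAST('2021-12-06 00:00:00' AS TIMESTAMP_NTZ)) AS la_time
""").show()
{code}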



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37568) Support 2-arguments by the convert_timezone() function

2021-12-07 Thread Kousuke Saruta (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454947#comment-17454947
 ] 

Kousuke Saruta commented on SPARK-37568:


cc: [~yoda-mon] [~YActs] Do you want to work on this?

> Support 2-arguments by the convert_timezone() function
> --
>
> Key: SPARK-37568
> URL: https://issues.apache.org/jira/browse/SPARK-37568
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Max Gekk
>Priority: Major
>
> # If sourceTs is a timestamp_ntz, take the sourceTz from the session time 
> zone, see the SQL config spark.sql.session.timeZone
> # If sourceTs is a timestamp_ltz, convert it to a timestamp_ntz using the 
> targetTz



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-37469) Unified "fetchWaitTime" and "shuffleReadTime" metrics On UI

2021-12-06 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-37469.

Fix Version/s: 3.3.0
 Assignee: Yazhi Wang
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/34720

> Unified "fetchWaitTime" and "shuffleReadTime" metrics On UI
> ---
>
> Key: SPARK-37469
> URL: https://issues.apache.org/jira/browse/SPARK-37469
> Project: Spark
>  Issue Type: Improvement
>  Components: Web UI
>Affects Versions: 3.2.0
>Reporter: Yazhi Wang
>Assignee: Yazhi Wang
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: executor-page.png, sql-page.png
>
>
> The metric is shown as "Shuffle Read Block Time" on the Executor/Task page, 
> but as "fetch wait time" on the SQL page, which is confusing.
> !executor-page.png!
> !sql-page.png!



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37529) Support K8s integration tests for Java 17

2021-12-02 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-37529:
--

 Summary: Support K8s integration tests for Java 17
 Key: SPARK-37529
 URL: https://issues.apache.org/jira/browse/SPARK-37529
 Project: Spark
  Issue Type: Sub-task
  Components: Kubernetes, Tests
Affects Versions: 3.3.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


Now that we can build container images for Java 17, let's support K8s 
integration tests for Java 17.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37487) CollectMetrics is executed twice if it is followed by a sort

2021-11-30 Thread Kousuke Saruta (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17451376#comment-17451376
 ] 

Kousuke Saruta commented on SPARK-37487:


[~tanelk] Thank you for pinging me.
I think the sampling job for the global sort performs the extra CollectMetrics 
evaluation (operations before the sort are executed twice).
Please let me look into it more.
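
As a rough illustration of that hypothesis, here is an RDD-level sketch that mirrors the suspected mechanism rather than the CollectMetrics code path itself; the local master and the exact counter value are assumptions for illustration.

{code:scala}
// Side effects upstream of a global sort run roughly twice, because RangePartitioner
// first launches a sampling job over the child to compute range boundaries, and then
// the shuffle for the sort evaluates the child again.
import org.apache.spark.sql.SparkSession

object SortSamplingSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("sort-sampling-sketch").getOrCreate()
    val sc = spark.sparkContext
    val evaluations = sc.longAccumulator("evaluations")

    sc.parallelize(1L to 100L, numSlices = 4)
      .map { x => evaluations.add(1); x } // side effect before the sort
      .sortBy(identity)                   // global sort => extra sampling pass
      .count()

    // Expect roughly 200 for 100 rows: once for sampling, once for the actual shuffle.
    println(s"map evaluated ${evaluations.value} times")
    spark.stop()
  }
}
{code}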

> CollectMetrics is executed twice if it is followed by a sort
> 
>
> Key: SPARK-37487
> URL: https://issues.apache.org/jira/browse/SPARK-37487
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Tanel Kiis
>Priority: Major
>  Labels: correctness
>
> It is best exemplified by this new UT in DataFrameCallbackSuite:
> {code}
>   test("SPARK-37487: get observable metrics with sort by callback") {
> val df = spark.range(100)
>   .observe(
> name = "my_event",
> min($"id").as("min_val"),
> max($"id").as("max_val"),
> // Test unresolved alias
> sum($"id"),
> count(when($"id" % 2 === 0, 1)).as("num_even"))
>   .observe(
> name = "other_event",
> avg($"id").cast("int").as("avg_val"))
>   .sort($"id".desc)
> validateObservedMetrics(df)
>   }
> {code}
> The count and sum aggregates report twice the number of rows:
> {code}
> [info] - SPARK-37487: get observable metrics with sort by callback *** FAILED 
> *** (169 milliseconds)
> [info]   [0,99,9900,100] did not equal [0,99,4950,50] 
> (DataFrameCallbackSuite.scala:342)
> [info]   org.scalatest.exceptions.TestFailedException:
> [info]   at 
> org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
> [info]   at 
> org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)
> [info]   at 
> org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1231)
> [info]   at 
> org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:1295)
> [info]   at 
> org.apache.spark.sql.util.DataFrameCallbackSuite.checkMetrics$1(DataFrameCallbackSuite.scala:342)
> [info]   at 
> org.apache.spark.sql.util.DataFrameCallbackSuite.validateObservedMetrics(DataFrameCallbackSuite.scala:350)
> [info]   at 
> org.apache.spark.sql.util.DataFrameCallbackSuite.$anonfun$new$21(DataFrameCallbackSuite.scala:324)
> [info]   at 
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
> [info]   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
> [info]   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
> [info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
> [info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
> [info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
> {code}
> I could not figure out how this happens. Hopefully the UT can help with 
> debugging.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37468) Support ANSI intervals and TimestampNTZ for UnionEstimation

2021-11-25 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37468:
---
Description: Currently, UnionEstimation doesn't support ANSI intervals and 
TimestampNTZ. But I think it can support those types because their underlying 
types are integer or long, which UnionEstimation can compute stats for.  (was: 
Currently, UnionEstimation doesn't support ANSI intervals and TimestampNTZ. But 
I think it can support those types because their underlying types are integer 
or long, which it UnionEstimation can compute stats for.)

> Support ANSI intervals and TimestampNTZ for UnionEstimation
> ---
>
> Key: SPARK-37468
> URL: https://issues.apache.org/jira/browse/SPARK-37468
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Major
>
> Currently, UnionEstimation doesn't support ANSI intervals and TimestampNTZ. 
> But I think it can support those types because their underlying types are 
> integer or long, which UnionEstimation can compute stats for.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37468) Support ANSI intervals and TimestampNTZ for UnionEstimation

2021-11-25 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-37468:
--

 Summary: Support ANSI intervals and TimestampNTZ for 
UnionEstimation
 Key: SPARK-37468
 URL: https://issues.apache.org/jira/browse/SPARK-37468
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.3.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


Currently, UnionEstimation doesn't support ANSI intervals and TimestampNTZ. But 
I think it can support those types because their underlying types are integer 
or long, which UnionEstimation can compute stats for.
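
A Spark-independent sketch of the idea follows. The real UnionEstimation operates on Catalyst ColumnStat objects; the case class and values below are illustrative assumptions. Year-month intervals are physically backed by an Int (months), while day-time intervals and TIMESTAMP_NTZ are backed by a Long (microseconds), so per-child min/max can be merged just as for plain integral columns.

{code:scala}
// Simplified, Spark-independent sketch: merging per-child min/max for a UNION column
// on the underlying integral representation of the type.
final case class ColStat(min: Long, max: Long)

object UnionStatsSketch {
  def mergeUnionStats(childStats: Seq[ColStat]): ColStat =
    childStats.reduce((a, b) => ColStat(math.min(a.min, b.min), math.max(a.max, b.max)))

  def main(args: Array[String]): Unit = {
    // e.g. two children of a UNION; the values stand for months of a YEAR-MONTH interval
    val merged = mergeUnionStats(Seq(ColStat(-5L, 100L), ColStat(40L, 250L)))
    println(merged) // ColStat(-5,250)
  }
}
{code}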



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37459) Upgrade commons-cli to 1.5.0

2021-11-24 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-37459:
--

 Summary: Upgrade commons-cli to 1.5.0
 Key: SPARK-37459
 URL: https://issues.apache.org/jira/browse/SPARK-37459
 Project: Spark
  Issue Type: Bug
  Components: Build
Affects Versions: 3.3.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


The currently used commons-cli is too old and contains an issue that affects the 
behavior of bin/spark-sql:

{code}
bin/spark-sql -e 'SELECT "Spark"'
...
Error in query: 
no viable alternative at input 'SELECT "'(line 1, pos 7)

== SQL ==
SELECT "Spark
---^^^
{code}

The root cause of this issue seems to be resolved in CLI-185.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-37354) Make the Java version installed on the container image used by the K8s integration tests with SBT configurable

2021-11-21 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-37354.

Fix Version/s: 3.3.0
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/34628

> Make the Java version installed on the container image used by the K8s 
> integration tests with SBT configurable
> --
>
> Key: SPARK-37354
> URL: https://issues.apache.org/jira/browse/SPARK-37354
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes, Tests
>Affects Versions: 3.2.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Major
> Fix For: 3.3.0
>
>
> I noticed that the default Java version installed on the container image used 
> by the K8s integration tests differs depending on how the tests are run.
> If the tests are launched by Maven, Java 8 is installed.
> On the other hand, if the tests are launched by SBT, Java 11 is installed.
> Further, we have no way to change the version.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37354) Make the Java version installed on the container image used by the K8s integration tests with SBT configurable

2021-11-17 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-37354:
--

 Summary: Make the Java version installed on the container image 
used by the K8s integration tests with SBT configurable
 Key: SPARK-37354
 URL: https://issues.apache.org/jira/browse/SPARK-37354
 Project: Spark
  Issue Type: Bug
  Components: Kubernetes, Tests
Affects Versions: 3.2.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


I noticed that the default Java version installed on the container image used 
by the K8s integration tests differs depending on how the tests are run.

If the tests are launched by Maven, Java 8 is installed.
On the other hand, if the tests are launched by SBT, Java 11 is installed.
Further, we have no way to change the version.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-37319) Support K8s image building with Java 17

2021-11-14 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-37319.

Fix Version/s: 3.3.0
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/34586

> Support K8s image building with Java 17
> ---
>
> Key: SPARK-37319
> URL: https://issues.apache.org/jira/browse/SPARK-37319
> Project: Spark
>  Issue Type: Sub-task
>  Components: Kubernetes
>Affects Versions: 3.3.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
> Fix For: 3.3.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37320) Delete py_container_checks.zip after the test in DepsTestsSuite finishes

2021-11-14 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37320:
---
Description: 
When K8s integration tests run, py_container_checks.zip still remains in 
resource-managers/kubernetes/integration-tests/tests/.
It is created in the test "Launcher python client dependencies using a zip 
file" in DepsTestsSuite.

  was:
When K8s integration tests run, py_container_checks.zip is still remaining in 
resource-managers/kubernetes/integration-tests/tests/.
It's is created in the test "Launcher python client dependencies using a zip 
file" in DepsTestsSuite.


> Delete py_container_checks.zip after the test in DepsTestsSuite finishes
> 
>
> Key: SPARK-37320
> URL: https://issues.apache.org/jira/browse/SPARK-37320
> Project: Spark
>  Issue Type: Bug
>  Components: k8, Tests
>Affects Versions: 3.2.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Minor
>
> When K8s integration tests run, py_container_checks.zip still remains in 
> resource-managers/kubernetes/integration-tests/tests/.
> It is created in the test "Launcher python client dependencies using a zip 
> file" in DepsTestsSuite.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37320) Delete py_container_checks.zip after the test in DepsTestsSuite finishes

2021-11-14 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-37320:
--

 Summary: Delete py_container_checks.zip after the test in 
DepsTestsSuite finishes
 Key: SPARK-37320
 URL: https://issues.apache.org/jira/browse/SPARK-37320
 Project: Spark
  Issue Type: Bug
  Components: k8, Tests
Affects Versions: 3.2.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


When K8s integration tests run, py_container_checks.zip still remains in 
resource-managers/kubernetes/integration-tests/tests/.
It is created in the test "Launcher python client dependencies using a zip 
file" in DepsTestsSuite.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37315) Mitigate ConcurrentModificationException thrown from a test in MLEventSuite

2021-11-13 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37315:
---
Summary: Mitigate ConcurrentModificationException thrown from a test in 
MLEventSuite  (was: Mitigate a ConcurrentModificationException thrown from a 
test in MLEventSuite)

> Mitigate ConcurrentModificationException thrown from a test in MLEventSuite
> ---
>
> Key: SPARK-37315
> URL: https://issues.apache.org/jira/browse/SPARK-37315
> Project: Spark
>  Issue Type: Bug
>  Components: ML, Tests
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Major
>
> Recently, I noticed that ConcurrentModificationException is sometimes thrown from 
> the following part of the test "pipeline read/write events" in MLEventSuite 
> when Scala 2.13 is used.
> {code}
> events.map(JsonProtocol.sparkEventToJson).foreach { event =>
>   assert(JsonProtocol.sparkEventFromJson(event).isInstanceOf[MLEvent])
> }
> {code}
> I think the root cause is that the ArrayBuffer (events) is updated asynchronously 
> by the following part.
> {code}
> private val listener: SparkListener = new SparkListener {
>   override def onOtherEvent(event: SparkListenerEvent): Unit = event match {
> case e: MLEvent => events.append(e)
> case _ =>
>   }
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37315) Mitigate a ConcurrentModificationException thrown from a test in MLEventSuite

2021-11-13 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37315:
---
Description: 
Recently, I noticed that ConcurrentModificationException is sometimes thrown from the 
following part of the test "pipeline read/write events" in MLEventSuite when 
Scala 2.13 is used.
{code}
events.map(JsonProtocol.sparkEventToJson).foreach { event =>
  assert(JsonProtocol.sparkEventFromJson(event).isInstanceOf[MLEvent])
}
{code}

I think the root cause is that the ArrayBuffer (events) is updated asynchronously by 
the following part.
{code}
private val listener: SparkListener = new SparkListener {
  override def onOtherEvent(event: SparkListenerEvent): Unit = event match {
case e: MLEvent => events.append(e)
case _ =>
  }
}
{code}

  was:
Recently, I notice ConcurrentModificationException is thrown from the following 
part of the test "pipeline read/write events" in MLEventSuite when Scala 2.13 
is used.
{code}
events.map(JsonProtocol.sparkEventToJson).foreach { event =>
  assert(JsonProtocol.sparkEventFromJson(event).isInstanceOf[MLEvent])
}
{code}

I think the root cause is the ArrayBuffer (events) is updated asynchronously by 
the following part.
{code}
private val listener: SparkListener = new SparkListener {
  override def onOtherEvent(event: SparkListenerEvent): Unit = event match {
case e: MLEvent => events.append(e)
case _ =>
  }
}
{code}


> Mitigate a ConcurrentModificationException thrown from a test in MLEventSuite
> -
>
> Key: SPARK-37315
> URL: https://issues.apache.org/jira/browse/SPARK-37315
> Project: Spark
>  Issue Type: Bug
>  Components: ML, Tests
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Major
>
> Recently, I noticed that ConcurrentModificationException is sometimes thrown from 
> the following part of the test "pipeline read/write events" in MLEventSuite 
> when Scala 2.13 is used.
> {code}
> events.map(JsonProtocol.sparkEventToJson).foreach { event =>
>   assert(JsonProtocol.sparkEventFromJson(event).isInstanceOf[MLEvent])
> }
> {code}
> I think the root cause is that the ArrayBuffer (events) is updated asynchronously 
> by the following part.
> {code}
> private val listener: SparkListener = new SparkListener {
>   override def onOtherEvent(event: SparkListenerEvent): Unit = event match {
> case e: MLEvent => events.append(e)
> case _ =>
>   }
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37315) Mitigate a ConcurrentModificationException thrown from a test in MLEventSuite

2021-11-13 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-37315:
--

 Summary: Mitigate a ConcurrentModificationException thrown from a 
test in MLEventSuite
 Key: SPARK-37315
 URL: https://issues.apache.org/jira/browse/SPARK-37315
 Project: Spark
  Issue Type: Bug
  Components: ML, Tests
Affects Versions: 3.3.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


Recently, I noticed that ConcurrentModificationException is thrown from the following 
part of the test "pipeline read/write events" in MLEventSuite when Scala 2.13 
is used.
{code}
events.map(JsonProtocol.sparkEventToJson).foreach { event =>
  assert(JsonProtocol.sparkEventFromJson(event).isInstanceOf[MLEvent])
}
{code}

I think the root cause is that the ArrayBuffer (events) is updated asynchronously by 
the following part.
{code}
private val listener: SparkListener = new SparkListener {
  override def onOtherEvent(event: SparkListenerEvent): Unit = event match {
case e: MLEvent => events.append(e)
case _ =>
  }
}
{code}
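
One possible mitigation, sketched independently of MLEventSuite, is shown below. The lock-and-snapshot approach and the names are assumptions for illustration; the actual fix may differ. The idea is to have the listener append under a lock and have the assertion loop iterate over an immutable snapshot.

{code:scala}
// Guard the shared buffer and iterate over an immutable copy, so an append from the
// listener thread cannot invalidate the iteration on the test thread.
import scala.collection.mutable.ArrayBuffer

object SnapshotSketch {
  private val lock = new Object
  private val events = ArrayBuffer.empty[String]

  // called from the listener thread
  def append(event: String): Unit = lock.synchronized { events += event }

  // called from the test thread before iterating/asserting
  def snapshot(): Seq[String] = lock.synchronized { events.toList }

  def main(args: Array[String]): Unit = {
    append("MLEvent-1")
    append("MLEvent-2")
    snapshot().foreach(e => assert(e.startsWith("MLEvent")))
  }
}
{code}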



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-37312) Add `.java-version` to `.gitignore` and `.rat-excludes`

2021-11-12 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-37312.

Fix Version/s: 3.3.0
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/34577

> Add `.java-version` to `.gitignore` and `.rat-excludes`
> ---
>
> Key: SPARK-37312
> URL: https://issues.apache.org/jira/browse/SPARK-37312
> Project: Spark
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 3.3.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Trivial
> Fix For: 3.3.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37314) Upgrade kubernetes-client to 5.10.1

2021-11-12 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37314:
---
Description: 
kubernetes-client 5.10.0 and 5.10.1 were released, which include some bug fixes.

https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.0
https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.1

Especially, the connection leak issue would affect Spark.
https://github.com/fabric8io/kubernetes-client/issues/3561

  was:
kubernetes-client 5.10.0 and 5.10.1 were relased, which include some bug fixes.

https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.0
https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.1

Especially, the connection leak issue would affect Spark.
https://github.com/fabric8io/kubernetes-client/issues/3561


> Upgrade kubernetes-client to 5.10.1
> ---
>
> Key: SPARK-37314
> URL: https://issues.apache.org/jira/browse/SPARK-37314
> Project: Spark
>  Issue Type: Bug
>  Components: Build, Kubernetes
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Major
>
> kubernetes-client 5.10.0 and 5.10.1 were released, which include some bug 
> fixes.
> https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.0
> https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.1
> Especially, the connection leak issue would affect Spark.
> https://github.com/fabric8io/kubernetes-client/issues/3561



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37314) Upgrade kubernetes-client to 5.10.1

2021-11-12 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37314:
---
Description: 
kubernetes-client 5.10.0 and 5.10.1 were released, which include some bug fixes.

https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.0
https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.1

Especially, the connection leak issue would affect Spark.
https://github.com/fabric8io/kubernetes-client/issues/3561

  was:
A few days ago, kubernetes-client 5.10.0 and 5.10.1 are relased, which include 
some bug fixes.

https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.0
https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.1

Especially, the connection leak issue would affect Spark.
https://github.com/fabric8io/kubernetes-client/issues/3561


> Upgrade kubernetes-client to 5.10.1
> ---
>
> Key: SPARK-37314
> URL: https://issues.apache.org/jira/browse/SPARK-37314
> Project: Spark
>  Issue Type: Bug
>  Components: Build, Kubernetes
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Major
>
> kubernetes-client 5.10.0 and 5.10.1 were released, which include some bug 
> fixes.
> https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.0
> https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.1
> Especially, the connection leak issue would affect Spark.
> https://github.com/fabric8io/kubernetes-client/issues/3561



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37314) Upgrade kubernetes-client to 5.10.1

2021-11-12 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37314:
---
Description: 
A few days ago, kubernetes-client 5.10.0 and 5.10.1 were released, which include 
some bug fixes.

https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.0
https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.1

Especially, the connection leak issue would affect Spark.
https://github.com/fabric8io/kubernetes-client/issues/3561

  was:
A few days ago, kubernetes-client 5.10.0 and 5.10.1 are relased, which include 
some bug fixes.
Especially, the connection leak issue would affect Spark.


> Upgrade kubernetes-client to 5.10.1
> ---
>
> Key: SPARK-37314
> URL: https://issues.apache.org/jira/browse/SPARK-37314
> Project: Spark
>  Issue Type: Bug
>  Components: Build, Kubernetes
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Major
>
> A few days ago, kubernetes-client 5.10.0 and 5.10.1 were released, which 
> include some bug fixes.
> https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.0
> https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.1
> Especially, the connection leak issue would affect Spark.
> https://github.com/fabric8io/kubernetes-client/issues/3561



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37314) Upgrade kubernetes-client to 5.10.1

2021-11-12 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-37314:
--

 Summary: Upgrade kubernetes-client to 5.10.1
 Key: SPARK-37314
 URL: https://issues.apache.org/jira/browse/SPARK-37314
 Project: Spark
  Issue Type: Bug
  Components: Build, Kubernetes
Affects Versions: 3.3.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


A few days ago, kubernetes-client 5.10.0 and 5.10.1 were released, which include 
some bug fixes.
Especially, the connection leak issue would affect Spark.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37302) Explicitly download the dependencies of guava and jetty-io in test-dependencies.sh

2021-11-12 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37302:
---
Description: 
dev/run-tests.py fails if Scala 2.13 is used and guava or jetty-io is not in 
both the Maven and Coursier local repositories.
{code:java}
$ rm -rf ~/.m2/repository/*
$ # For Linux
$ rm -rf ~/.cache/coursier/v1/*
$ # For macOS
$ rm -rf ~/Library/Caches/Coursier/v1/*
$ dev/change-scala-version.sh 2.13
$ dev/test-dependencies.sh
$ build/sbt -Pscala-2.13 clean compile
...
[error] 
/home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java:24:1:
  error: package com.google.common.primitives does not exist
[error] import com.google.common.primitives.Ints;
[error]^
[error] 
/home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java:30:1:
  error: package com.google.common.annotations does not exist
[error] import com.google.common.annotations.VisibleForTesting;
[error] ^
[error] 
/home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java:31:1:
  error: package com.google.common.base does not exist
[error] import com.google.common.base.Preconditions;
...
{code}
{code:java}
[error] 
/home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala:87:25:
 Class org.eclipse.jetty.io.ByteBufferPool not found - continuing with a stub.
[error] val connector = new ServerConnector(
[error] ^
[error] 
/home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala:87:21:
 multiple constructors for ServerConnector with alternatives:
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
java.util.concurrent.Executor,x$3: org.eclipse.jetty.util.thread.Scheduler,x$4: 
org.eclipse.jetty.io.ByteBufferPool,x$5: Int,x$6: Int,x$7: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
org.eclipse.jetty.util.ssl.SslContextFactory,x$3: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: Int,x$3: Int,x$4: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
[error]  cannot be invoked with (org.eclipse.jetty.server.Server, Null, 
org.eclipse.jetty.util.thread.ScheduledExecutorScheduler, Null, Int, Int, 
org.eclipse.jetty.server.HttpConnectionFactory)
[error] val connector = new ServerConnector(
[error] ^
[error] 
/home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala:207:13:
 Class org.eclipse.jetty.io.ClientConnectionFactory not found - continuing with 
a stub.
[error] new HttpClient(new HttpClientTransportOverHTTP(numSelectors), 
null)
[error] ^
[error] 
/home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala:287:25:
 multiple constructors for ServerConnector with alternatives:
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
java.util.concurrent.Executor,x$3: org.eclipse.jetty.util.thread.Scheduler,x$4: 
org.eclipse.jetty.io.ByteBufferPool,x$5: Int,x$6: Int,x$7: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
org.eclipse.jetty.util.ssl.SslContextFactory,x$3: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: Int,x$3: Int,x$4: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
[error]  cannot be invoked with (org.eclipse.jetty.server.Server, Null, 
org.eclipse.jetty.util.thread.ScheduledExecutorScheduler, Null, Int, Int, 
org.eclipse.jetty.server.ConnectionFactory)
[error] val connector = new ServerConnector(
{code}
The reason is that the exec-maven-plugin used in test-dependencies.sh downloads the 
pom of guava and jetty-io but doesn't download the corresponding jars, and 
dependency testing is skipped if Scala 2.13 is used (if dependency testing ran, 
Maven would download those jars).
{code}
if [[ "$SCALA_BINARY_VERSION" != "2.12" ]]; then
  # TODO(SPARK-36168) Support Scala 2.13 in dev/test-dependencies.sh
  echo "Skip dependency testing on $SCALA_BINARY_VERSION"
  exit 0
fi
{code}
{code:java}
$ find ~/.m2 

[jira] [Updated] (SPARK-37302) Explicitly download the dependencies of guava and jetty-io in test-dependencies.sh

2021-11-12 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37302:
---
Description: 
dev/run-tests.py fails if Scala 2.13 is used and guava or jetty-io is not in 
both the Maven and Coursier local repositories.
{code:java}
$ rm -rf ~/.m2/repository/*
$ # For Linux
$ rm -rf ~/.cache/coursier/v1/*
$ # For macOS
$ rm -rf ~/Library/Caches/Coursier/v1/*
$ dev/change-scala-version.sh 2.13
$ dev/test-dependencies.sh
$ build/sbt -Pscala-2.13 clean compile
...
[error] 
/home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java:24:1:
  error: package com.google.common.primitives does not exist
[error] import com.google.common.primitives.Ints;
[error]^
[error] 
/home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java:30:1:
  error: package com.google.common.annotations does not exist
[error] import com.google.common.annotations.VisibleForTesting;
[error] ^
[error] 
/home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java:31:1:
  error: package com.google.common.base does not exist
[error] import com.google.common.base.Preconditions;
...
{code}
{code:java}
[error] 
/home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala:87:25:
 Class org.eclipse.jetty.io.ByteBufferPool not found - continuing with a stub.
[error] val connector = new ServerConnector(
[error] ^
[error] 
/home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala:87:21:
 multiple constructors for ServerConnector with alternatives:
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
java.util.concurrent.Executor,x$3: org.eclipse.jetty.util.thread.Scheduler,x$4: 
org.eclipse.jetty.io.ByteBufferPool,x$5: Int,x$6: Int,x$7: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
org.eclipse.jetty.util.ssl.SslContextFactory,x$3: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: Int,x$3: Int,x$4: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
[error]  cannot be invoked with (org.eclipse.jetty.server.Server, Null, 
org.eclipse.jetty.util.thread.ScheduledExecutorScheduler, Null, Int, Int, 
org.eclipse.jetty.server.HttpConnectionFactory)
[error] val connector = new ServerConnector(
[error] ^
[error] 
/home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala:207:13:
 Class org.eclipse.jetty.io.ClientConnectionFactory not found - continuing with 
a stub.
[error] new HttpClient(new HttpClientTransportOverHTTP(numSelectors), 
null)
[error] ^
[error] 
/home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala:287:25:
 multiple constructors for ServerConnector with alternatives:
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
java.util.concurrent.Executor,x$3: org.eclipse.jetty.util.thread.Scheduler,x$4: 
org.eclipse.jetty.io.ByteBufferPool,x$5: Int,x$6: Int,x$7: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
org.eclipse.jetty.util.ssl.SslContextFactory,x$3: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: Int,x$3: Int,x$4: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
[error]  cannot be invoked with (org.eclipse.jetty.server.Server, Null, 
org.eclipse.jetty.util.thread.ScheduledExecutorScheduler, Null, Int, Int, 
org.eclipse.jetty.server.ConnectionFactory)
[error] val connector = new ServerConnector(
{code}
The reason is that the exec-maven-plugin used in test-dependencies.sh downloads the 
pom of guava and jetty-io but doesn't download the corresponding jars, and 
dependency testing is skipped if Scala 2.13 is used (if dependency testing ran, 
Maven would download those jars).
{code}
if [[ "$SCALA_BINARY_VERSION" != "2.12" ]]; then
  # TODO(SPARK-36168) Support Scala 2.13 in dev/test-dependencies.sh
  echo "Skip dependency testing on $SCALA_BINARY_VERSION"
  exit 0
fi
{code}
{code:java}
$ find ~/.m2 

[jira] [Updated] (SPARK-37302) Explicitly download the dependencies of guava and jetty-io in test-dependencies.sh

2021-11-12 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37302:
---
Description: 
dev/run-tests.py fails if Scala 2.13 is used and guava or jetty-io is not in 
both the Maven and Coursier local repositories.
{code:java}
$ rm -rf ~/.m2/repository/*
$ # For Linux
$ rm -rf ~/.cache/coursier/v1/*
$ # For macOS
$ rm -rf ~/Library/Caches/Coursier/v1/*
$ dev/change-scala-version.sh 2.13
$ dev/test-dependencies.sh
$ build/sbt -Pscala-2.13 clean compile
...
[error] 
/home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java:24:1:
  error: package com.google.common.primitives does not exist
[error] import com.google.common.primitives.Ints;
[error]^
[error] 
/home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java:30:1:
  error: package com.google.common.annotations does not exist
[error] import com.google.common.annotations.VisibleForTesting;
[error] ^
[error] 
/home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java:31:1:
  error: package com.google.common.base does not exist
[error] import com.google.common.base.Preconditions;
...
{code}
{code:java}
[error] 
/home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala:87:25:
 Class org.eclipse.jetty.io.ByteBufferPool not found - continuing with a stub.
[error] val connector = new ServerConnector(
[error] ^
[error] 
/home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala:87:21:
 multiple constructors for ServerConnector with alternatives:
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
java.util.concurrent.Executor,x$3: org.eclipse.jetty.util.thread.Scheduler,x$4: 
org.eclipse.jetty.io.ByteBufferPool,x$5: Int,x$6: Int,x$7: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
org.eclipse.jetty.util.ssl.SslContextFactory,x$3: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: Int,x$3: Int,x$4: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
[error]  cannot be invoked with (org.eclipse.jetty.server.Server, Null, 
org.eclipse.jetty.util.thread.ScheduledExecutorScheduler, Null, Int, Int, 
org.eclipse.jetty.server.HttpConnectionFactory)
[error] val connector = new ServerConnector(
[error] ^
[error] 
/home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala:207:13:
 Class org.eclipse.jetty.io.ClientConnectionFactory not found - continuing with 
a stub.
[error] new HttpClient(new HttpClientTransportOverHTTP(numSelectors), 
null)
[error] ^
[error] 
/home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala:287:25:
 multiple constructors for ServerConnector with alternatives:
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
java.util.concurrent.Executor,x$3: org.eclipse.jetty.util.thread.Scheduler,x$4: 
org.eclipse.jetty.io.ByteBufferPool,x$5: Int,x$6: Int,x$7: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
org.eclipse.jetty.util.ssl.SslContextFactory,x$3: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: Int,x$3: Int,x$4: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
[error]  cannot be invoked with (org.eclipse.jetty.server.Server, Null, 
org.eclipse.jetty.util.thread.ScheduledExecutorScheduler, Null, Int, Int, 
org.eclipse.jetty.server.ConnectionFactory)
[error] val connector = new ServerConnector(
{code}
The reason is that the exec-maven-plugin used in `test-dependencies.sh` downloads the 
pom of guava and jetty-io but doesn't download the corresponding jars.
{code:java}
$ find ~/.m2 -name "guava*"
...
/home/kou/.m2/repository/com/google/guava/guava/14.0.1/guava-14.0.1.pom
/home/kou/.m2/repository/com/google/guava/guava/14.0.1/guava-14.0.1.pom.sha1
...
/home/kou/.m2/repository/com/google/guava/guava-parent/14.0.1/guava-parent-14.0.1.pom

[jira] [Updated] (SPARK-37302) Explicitly download the dependencies of guava and jetty-io in test-dependencies.sh

2021-11-12 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37302:
---
Description: 
dev/run-tests.py fails if Scala 2.13 is used and guava or jetty-io is not in 
both the Maven and Coursier local repositories.
{code:java}
$ rm -rf ~/.m2/repository/*
$ # For Linux
$ rm -rf ~/.cache/coursier/v1/*
$ # For macOS
$ rm -rf ~/Library/Caches/Coursier/v1/*
$ dev/change-scala-version.sh 2.13
$ dev/test-dependencies.sh
$ build/sbt -Pscala-2.13 clean compile
...
[error] 
/home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java:24:1:
  error: package com.google.common.primitives does not exist
[error] import com.google.common.primitives.Ints;
[error]^
[error] 
/home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java:30:1:
  error: package com.google.common.annotations does not exist
[error] import com.google.common.annotations.VisibleForTesting;
[error] ^
[error] 
/home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java:31:1:
  error: package com.google.common.base does not exist
[error] import com.google.common.base.Preconditions;
...
{code}
{code:java}
[error] 
/home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala:87:25:
 Class org.eclipse.jetty.io.ByteBufferPool not found - continuing with a stub.
[error] val connector = new ServerConnector(
[error] ^
[error] 
/home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala:87:21:
 multiple constructors for ServerConnector with alternatives:
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
java.util.concurrent.Executor,x$3: org.eclipse.jetty.util.thread.Scheduler,x$4: 
org.eclipse.jetty.io.ByteBufferPool,x$5: Int,x$6: Int,x$7: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
org.eclipse.jetty.util.ssl.SslContextFactory,x$3: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: Int,x$3: Int,x$4: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
[error]  cannot be invoked with (org.eclipse.jetty.server.Server, Null, 
org.eclipse.jetty.util.thread.ScheduledExecutorScheduler, Null, Int, Int, 
org.eclipse.jetty.server.HttpConnectionFactory)
[error] val connector = new ServerConnector(
[error] ^
[error] 
/home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala:207:13:
 Class org.eclipse.jetty.io.ClientConnectionFactory not found - continuing with 
a stub.
[error] new HttpClient(new HttpClientTransportOverHTTP(numSelectors), 
null)
[error] ^
[error] 
/home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala:287:25:
 multiple constructors for ServerConnector with alternatives:
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
java.util.concurrent.Executor,x$3: org.eclipse.jetty.util.thread.Scheduler,x$4: 
org.eclipse.jetty.io.ByteBufferPool,x$5: Int,x$6: Int,x$7: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
org.eclipse.jetty.util.ssl.SslContextFactory,x$3: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: Int,x$3: Int,x$4: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
[error]  cannot be invoked with (org.eclipse.jetty.server.Server, Null, 
org.eclipse.jetty.util.thread.ScheduledExecutorScheduler, Null, Int, Int, 
org.eclipse.jetty.server.ConnectionFactory)
[error] val connector = new ServerConnector(
{code}
The reason is that the exec-maven-plugin used in `test-dependencies.sh` downloads the 
pom of guava and jetty-io but doesn't download the corresponding jars.
{code:java}
$ find ~/.m2 -name "guava*"
...
/home/kou/.m2/repository/com/google/guava/guava/14.0.1/guava-14.0.1.pom
/home/kou/.m2/repository/com/google/guava/guava/14.0.1/guava-14.0.1.pom.sha1
...
/home/kou/.m2/repository/com/google/guava/guava-parent/14.0.1/guava-parent-14.0.1.pom

[jira] [Created] (SPARK-37302) Explicitly download the dependencies of guava and jetty-io in test-dependencies.sh

2021-11-12 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-37302:
--

 Summary: Explicitly download the dependencies of guava and 
jetty-io in test-dependencies.sh
 Key: SPARK-37302
 URL: https://issues.apache.org/jira/browse/SPARK-37302
 Project: Spark
  Issue Type: Bug
  Components: Build
Affects Versions: 3.2.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


dev/run-tests.py fails if Scala 2.13 is used and guava or jetty-io is not in 
both the Maven and Coursier local repositories.

{code}
$ rm -rf ~/.m2/repository/*
$ # For Linux
$ rm -rf ~/.cache/coursier/v1/*
$ # For macOS
$ rm -rf ~/Library/Caches/Coursier/v1/*
$ dev/change-scala-version.sh 2.13
$ dev/test-dependencies.sh
$ build/sbt -Pscala-2.13 clean compile
...
[error] 
/home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java:24:1:
  error: package com.google.common.primitives does not exist
[error] import com.google.common.primitives.Ints;
[error]^
[error] 
/home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java:30:1:
  error: package com.google.common.annotations does not exist
[error] import com.google.common.annotations.VisibleForTesting;
[error] ^
[error] 
/home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java:31:1:
  error: package com.google.common.base does not exist
[error] import com.google.common.base.Preconditions;
...
{code}
{code}
[error] 
/home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala:87:25:
 Class org.eclipse.jetty.io.ByteBufferPool not found - continuing with a stub.
[error] val connector = new ServerConnector(
[error] ^
[error] 
/home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala:87:21:
 multiple constructors for ServerConnector with alternatives:
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
java.util.concurrent.Executor,x$3: org.eclipse.jetty.util.thread.Scheduler,x$4: 
org.eclipse.jetty.io.ByteBufferPool,x$5: Int,x$6: Int,x$7: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
org.eclipse.jetty.util.ssl.SslContextFactory,x$3: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: Int,x$3: Int,x$4: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
[error]  cannot be invoked with (org.eclipse.jetty.server.Server, Null, 
org.eclipse.jetty.util.thread.ScheduledExecutorScheduler, Null, Int, Int, 
org.eclipse.jetty.server.HttpConnectionFactory)
[error] val connector = new ServerConnector(
[error] ^
[error] 
/home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala:207:13:
 Class org.eclipse.jetty.io.ClientConnectionFactory not found - continuing with 
a stub.
[error] new HttpClient(new HttpClientTransportOverHTTP(numSelectors), 
null)
[error] ^
[error] 
/home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala:287:25:
 multiple constructors for ServerConnector with alternatives:
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
java.util.concurrent.Executor,x$3: org.eclipse.jetty.util.thread.Scheduler,x$4: 
org.eclipse.jetty.io.ByteBufferPool,x$5: Int,x$6: Int,x$7: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
org.eclipse.jetty.util.ssl.SslContextFactory,x$3: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
 
[error]   (x$1: org.eclipse.jetty.server.Server,x$2: Int,x$3: Int,x$4: 
org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector
[error]  cannot be invoked with (org.eclipse.jetty.server.Server, Null, 
org.eclipse.jetty.util.thread.ScheduledExecutorScheduler, Null, Int, Int, 
org.eclipse.jetty.server.ConnectionFactory)
[error] val connector = new ServerConnector(
{code}


The reason is that the exec-maven-plugin used in `test-dependencies.sh` downloads the 
pom of guava and jetty-io but doesn't download the corresponding jars.

{code}
$ find ~/.m2 -name "guava*"
...

[jira] [Created] (SPARK-37284) Upgrade Jekyll to 4.2.1

2021-11-10 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-37284:
--

 Summary: Upgrade Jekyll to 4.2.1
 Key: SPARK-37284
 URL: https://issues.apache.org/jira/browse/SPARK-37284
 Project: Spark
  Issue Type: Improvement
  Components: Build
Affects Versions: 3.3.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


Jekyll 4.2.1 was released in September, and it includes a fix for a regression 
bug.
https://github.com/jekyll/jekyll/releases/tag/v4.2.1



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37283) Don't try to store a V1 table which contains ANSI intervals in Hive compatible format

2021-11-10 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37283:
---
Description: 
If a table being created contains a column of ANSI interval types and the 
underlying file format has a corresponding Hive SerDe (e.g. Parquet),
`HiveExternalCatalog` tries to store the table in a Hive compatible format.
But, as ANSI interval types in Spark and the interval types in Hive are not 
compatible (Hive only supports interval_year_month and interval_day_time), the 
following warning with a stack trace will be logged.

{code}
spark-sql> CREATE TABLE tbl1(a INTERVAL YEAR TO MONTH) USING Parquet;
21/11/11 14:39:29 WARN SessionState: METASTORE_FILTER_HOOK will be ignored, 
since hive.security.authorization.manager is set to instance of 
HiveAuthorizerFactory.
21/11/11 14:39:29 WARN HiveExternalCatalog: Could not persist `default`.`tbl1` 
in a Hive compatible way. Persisting it into Hive metastore in Spark SQL 
specific format.
org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.IllegalArgumentException: Error: type expected at the position 0 of 
'interval year to month' but 'interval year to month' is found.
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:869)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:874)
at 
org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$createTable$1(HiveClientImpl.scala:553)
at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at 
org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:303)
at 
org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:234)
at 
org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:233)
at 
org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:283)
at 
org.apache.spark.sql.hive.client.HiveClientImpl.createTable(HiveClientImpl.scala:551)
at 
org.apache.spark.sql.hive.HiveExternalCatalog.saveTableIntoHive(HiveExternalCatalog.scala:499)
at 
org.apache.spark.sql.hive.HiveExternalCatalog.createDataSourceTable(HiveExternalCatalog.scala:397)
at 
org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$createTable$1(HiveExternalCatalog.scala:274)
at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at 
org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:102)
at 
org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:245)
at 
org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createTable(ExternalCatalogWithListener.scala:94)
at 
org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:376)
at 
org.apache.spark.sql.execution.command.CreateDataSourceTableCommand.run(createDataSourceTables.scala:120)
at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)
at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)
at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84)
at 
org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:97)
at 
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
at 
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
at 
org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:97)
at 
org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:93)
at 
org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:481)
at 
org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:481)
at 
org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
at 
org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
at 

[jira] [Created] (SPARK-37283) Don't try to store a V1 table which contains ANSI intervals in Hive compatible format

2021-11-10 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-37283:
--

 Summary: Don't try to store a V1 table which contains ANSI 
intervals in Hive compatible format
 Key: SPARK-37283
 URL: https://issues.apache.org/jira/browse/SPARK-37283
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.2.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


If a table being created contains a column of ANSI interval types and the 
underlying file format has a corresponding Hive SerDe (e.g. Parquet),
`HiveExternalCatalog` tries to store the table in a Hive compatible format.
But, as ANSI interval types in Spark and the interval types in Hive are not 
compatible (Hive only supports interval_year_month and interval_day_time), the 
following warning with a stack trace will be logged.

{code}
spark-sql> CREATE TABLE tbl1(a INTERVAL YEAR TO MONTH) USING Parquet;
21/11/11 14:39:29 WARN SessionState: METASTORE_FILTER_HOOK will be ignored, 
since hive.security.authorization.manager is set to instance of 
HiveAuthorizerFactory.
21/11/11 14:39:29 WARN HiveExternalCatalog: Could not persist `default`.`tbl1` 
in a Hive compatible way. Persisting it into Hive metastore in Spark SQL 
specific format.
org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.IllegalArgumentException: Error: type expected at the position 0 of 
'interval year to month' but 'interval year to month' is found.
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:869)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:874)
at 
org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$createTable$1(HiveClientImpl.scala:553)
at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at 
org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:303)
at 
org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:234)
at 
org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:233)
at 
org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:283)
at 
org.apache.spark.sql.hive.client.HiveClientImpl.createTable(HiveClientImpl.scala:551)
at 
org.apache.spark.sql.hive.HiveExternalCatalog.saveTableIntoHive(HiveExternalCatalog.scala:499)
at 
org.apache.spark.sql.hive.HiveExternalCatalog.createDataSourceTable(HiveExternalCatalog.scala:397)
at 
org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$createTable$1(HiveExternalCatalog.scala:274)
at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at 
org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:102)
at 
org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:245)
at 
org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createTable(ExternalCatalogWithListener.scala:94)
at 
org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:376)
at 
org.apache.spark.sql.execution.command.CreateDataSourceTableCommand.run(createDataSourceTables.scala:120)
at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)
at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)
at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84)
at 
org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:97)
at 
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
at 
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
at 
org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:97)
at 
org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:93)
at 
org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:481)
at 
org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:481)
at 

[jira] [Resolved] (SPARK-37264) [SPARK-37264][BUILD] Exclude hadoop-client-api transitive dependency from orc-core

2021-11-09 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-37264.

Fix Version/s: 3.3.0
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/34541

> [SPARK-37264][BUILD] Exclude hadoop-client-api transitive dependency from 
> orc-core
> --
>
> Key: SPARK-37264
> URL: https://issues.apache.org/jira/browse/SPARK-37264
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Minor
> Fix For: 3.3.0
>
>
> Like hadoop-common and hadoop-hdfs, this PR proposes to exclude 
> hadoop-client-api transitive dependency from orc-core.
> Why are the changes needed?
> Since Apache Hadoop 2.7 doesn't work on Java 17, Apache ORC has a dependency 
> on Hadoop 3.3.1.
> This causes test-dependencies.sh failure on Java 17. As a result, 
> run-tests.py also fails.
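
For readers unfamiliar with how such a dependency cut looks, below is a purely 
illustrative sketch in sbt syntax. It is an assumption for illustration only: 
Spark's actual build is Maven-based and the real change lives in the pom files, 
so the coordinates and version shown are examples, not the committed change.

{code:scala}
// Hypothetical sbt-style sketch: exclude the hadoop-client-api artifact that
// orc-core would otherwise pull in transitively (Maven expresses the same idea
// with an <exclusion> entry on the orc-core dependency).
libraryDependencies += "org.apache.orc" % "orc-core" % "1.7.1" exclude("org.apache.hadoop", "hadoop-client-api")
{code}

Either way, the intent is that orc-core stops dragging in Hadoop 3.3.1 artifacts, 
so the dependency manifests checked by test-dependencies.sh stay consistent with 
the selected Hadoop profile.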



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37264) [SPARK-37264][BUILD] Exclude hadoop-client-api transitive dependency from orc-core

2021-11-09 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37264:
---
Description: 
Like hadoop-common and hadoop-hdfs, this PR proposes to exclude 
hadoop-client-api transitive dependency from orc-core.
Why are the changes needed?

Since Apache Hadoop 2.7 doesn't work on Java 17, Apache ORC has a dependency on 
Hadoop 3.3.1.
This causes test-dependencies.sh failure on Java 17. As a result, run-tests.py 
also fails.

  was:
In the current master, `run-tests.py` fails on Java 17 due to 
`test-dependencies.sh` fails. The cause is orc-shims:1.7.1 has a compile 
dependency on hadoop-client-api:3.3.1 only for Java 17.
Hadoop 2.7 doesn't support Java 17 so let's 


> [SPARK-37264][BUILD] Exclude hadoop-client-api transitive dependency from 
> orc-core
> --
>
> Key: SPARK-37264
> URL: https://issues.apache.org/jira/browse/SPARK-37264
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Minor
>
> Like hadoop-common and hadoop-hdfs, this PR proposes to exclude 
> hadoop-client-api transitive dependency from orc-core.
> Why are the changes needed?
> Since Apache Hadoop 2.7 doesn't work on Java 17, Apache ORC has a dependency 
> on Hadoop 3.3.1.
> This causes test-dependencies.sh failure on Java 17. As a result, 
> run-tests.py also fails.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37264) Cut the transitive dependency on hadoop-client-api which orc-shims depends on only for Java 17 with hadoop-2.7

2021-11-09 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37264:
---
Description: 
In the current master, `run-tests.py` fails on Java 17 due to 
`test-dependencies.sh` fails. The cause is orc-shims:1.7.1 has a compile 
dependency on hadoop-client-api:3.3.1 only for Java 17.
Hadoop 2.7 doesn't support Java 17 so let's 

  was:
In the current master, `run-tests.py` fails on Java 17 due to 
`test-dependencies.sh` fails. The cause is orc-shims:1.7.1 has a compile 
dependency on hadoop-client-api:3.3.1 only for Java 17.
Currently, we don't maintain the dependency manifests for Java 17 yet so let's 
skip it temporarily.


> Cut the transitive dependency on hadoop-client-api which orc-shims depends on 
> only for Java 17 with hadoop-2.7
> --
>
> Key: SPARK-37264
> URL: https://issues.apache.org/jira/browse/SPARK-37264
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Minor
>
> In the current master, `run-tests.py` fails on Java 17 due to 
> `test-dependencies.sh` fails. The cause is orc-shims:1.7.1 has a compile 
> dependency on hadoop-client-api:3.3.1 only for Java 17.
> Hadoop 2.7 doesn't support Java 17 so let's 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37264) [SPARK-37264][BUILD] Exclude hadoop-client-api transitive dependency from orc-core

2021-11-09 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37264:
---
Summary: [SPARK-37264][BUILD] Exclude hadoop-client-api transitive 
dependency from orc-core  (was: Cut the transitive dependency on 
hadoop-client-api which orc-shims depends on only for Java 17 with hadoop-2.7)

> [SPARK-37264][BUILD] Exclude hadoop-client-api transitive dependency from 
> orc-core
> --
>
> Key: SPARK-37264
> URL: https://issues.apache.org/jira/browse/SPARK-37264
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Minor
>
> In the current master, `run-tests.py` fails on Java 17 due to 
> `test-dependencies.sh` fails. The cause is orc-shims:1.7.1 has a compile 
> dependency on hadoop-client-api:3.3.1 only for Java 17.
> Hadoop 2.7 doesn't support Java 17 so let's 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37264) Cut the transitive dependency on hadoop-client-api which orc-shims depends on only for Java 17 with hadoop-2.7

2021-11-09 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37264:
---
Summary: Cut the transitive dependency on hadoop-client-api which orc-shims 
depends on only for Java 17 with hadoop-2.7  (was: Skip dependency testing on 
Java 17 temporarily)

> Cut the transitive dependency on hadoop-client-api which orc-shims depends on 
> only for Java 17 with hadoop-2.7
> --
>
> Key: SPARK-37264
> URL: https://issues.apache.org/jira/browse/SPARK-37264
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Minor
>
> In the current master, `run-tests.py` fails on Java 17 due to 
> `test-dependencies.sh` fails. The cause is orc-shims:1.7.1 has a compile 
> dependency on hadoop-client-api:3.3.1 only for Java 17.
> Currently, we don't maintain the dependency manifests for Java 17 yet so 
> let's skip it temporarily.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37264) Skip dependency testing on Java 17 temporarily

2021-11-09 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37264:
---
Description: 
In the current master, `run-tests.py` fails on Java 17 due to 
`test-dependencies.sh` fails. The cause is orc-shims:1.7.1 has a compile 
dependency on hadoop-client-api:3.3.1 only for Java 17.
Currently, we don't maintain the dependency manifests for Java 17 yet so let's 
skip it temporarily.

  was:
In the current master, test-dependencies.sh fails on Java 17 because 
orc-shims:1.7.1 has a compile dependency on hadoop-client-api:3.3.1 only for 
Java 17.

Currently, we don't maintain the dependency manifests for Java 17 yet so let's 
skip it temporarily.


> Skip dependency testing on Java 17 temporarily
> --
>
> Key: SPARK-37264
> URL: https://issues.apache.org/jira/browse/SPARK-37264
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Minor
>
> In the current master, `run-tests.py` fails on Java 17 due to 
> `test-dependencies.sh` fails. The cause is orc-shims:1.7.1 has a compile 
> dependency on hadoop-client-api:3.3.1 only for Java 17.
> Currently, we don't maintain the dependency manifests for Java 17 yet so 
> let's skip it temporarily.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37265) Support Java 17 in `dev/test-dependencies.sh`

2021-11-09 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-37265:
--

 Summary: Support Java 17 in `dev/test-dependencies.sh`
 Key: SPARK-37265
 URL: https://issues.apache.org/jira/browse/SPARK-37265
 Project: Spark
  Issue Type: Sub-task
  Components: Tests
Affects Versions: 3.3.0
Reporter: Kousuke Saruta






--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37264) Skip dependency testing on Java 17 temporarily

2021-11-09 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-37264:
--

 Summary: Skip dependency testing on Java 17 temporarily
 Key: SPARK-37264
 URL: https://issues.apache.org/jira/browse/SPARK-37264
 Project: Spark
  Issue Type: Sub-task
  Components: Build
Affects Versions: 3.3.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


In the current master, test-dependencies.sh fails on Java 17 because 
orc-shims:1.7.1 has a compile dependency on hadoop-client-api:3.3.1 only for 
Java 17.

Currently, we don't maintain the dependency manifests for Java 17 yet so let's 
skip it temporarily.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] (SPARK-36895) Add Create Index syntax support

2021-11-08 Thread Kousuke Saruta (Jira)


[ https://issues.apache.org/jira/browse/SPARK-36895 ]


Kousuke Saruta deleted comment on SPARK-36895:


was (Author: sarutak):
The change in https://github.com/apache/spark/pull/34148 was reverted and 
resolved again in https://github.com/apache/spark/pull/34523

> Add Create Index syntax support
> ---
>
> Key: SPARK-36895
> URL: https://issues.apache.org/jira/browse/SPARK-36895
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Huaxin Gao
>Assignee: Huaxin Gao
>Priority: Major
> Fix For: 3.3.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-36895) Add Create Index syntax support

2021-11-08 Thread Kousuke Saruta (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17440696#comment-17440696
 ] 

Kousuke Saruta commented on SPARK-36895:


The change in https://github.com/apache/spark/pull/34148 was reverted and 
resolved again in https://github.com/apache/spark/pull/34523

> Add Create Index syntax support
> ---
>
> Key: SPARK-36895
> URL: https://issues.apache.org/jira/browse/SPARK-36895
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Huaxin Gao
>Assignee: Huaxin Gao
>Priority: Major
> Fix For: 3.3.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-37240) Cannot read partitioned parquet files with ANSI interval partition values

2021-11-08 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-37240.

Fix Version/s: 3.3.0
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/34517

> Cannot read partitioned parquet files with ANSI interval partition values
> -
>
> Key: SPARK-37240
> URL: https://issues.apache.org/jira/browse/SPARK-37240
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Max Gekk
>Assignee: Max Gekk
>Priority: Major
> Fix For: 3.3.0
>
>
> The code below demonstrates the issue:
> {code:scala}
> scala> sql("SELECT INTERVAL '1' YEAR AS i, 0 as 
> id").write.partitionBy("i").parquet("/Users/maximgekk/tmp/ansi_interval_parquet")
> scala> spark.read.schema("i INTERVAL YEAR, id 
> INT").parquet("/Users/maximgekk/tmp/ansi_interval_parquet").show(false)
> 21/11/08 10:56:36 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 2)
> java.lang.RuntimeException: DataType INTERVAL YEAR is not supported in column 
> vectorized reader.
>   at 
> org.apache.spark.sql.execution.vectorized.ColumnVectorUtils.populate(ColumnVectorUtils.java:100)
>   at 
> org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initBatch(VectorizedParquetRecordReader.java:243)
> {code}
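
As a side note not taken from the ticket itself, a possible workaround until the 
fix lands is to bypass the vectorized Parquet reader. The sketch below is an 
assumption: it reuses the illustrative path from the reproduction above, assumes 
`spark` is an active SparkSession (e.g. in spark-shell), and whether the row-based 
reader accepts the interval partition values is not verified here.

{code:scala}
// Hedged workaround sketch: disable the vectorized Parquet reader so Spark falls
// back to the row-based reader, which may avoid the unsupported-type check above.
spark.conf.set("spark.sql.parquet.enableVectorizedReader", "false")
spark.read
  .schema("i INTERVAL YEAR, id INT")
  .parquet("/Users/maximgekk/tmp/ansi_interval_parquet")
  .show(false)
{code}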



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Reopened] (SPARK-36038) Basic speculation metrics at stage level

2021-11-08 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta reopened SPARK-36038:

  Assignee: (was: Venkata krishnan Sowrirajan)

The change was reverted.
https://github.com/apache/spark/pull/34518

So I re-open this.

> Basic speculation metrics at stage level
> 
>
> Key: SPARK-36038
> URL: https://issues.apache.org/jira/browse/SPARK-36038
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.1.2
>Reporter: Venkata krishnan Sowrirajan
>Priority: Major
> Fix For: 3.3.0
>
>
> Currently there are no speculation metrics available at either the application 
> level or the stage level. Within our platform, we have added speculation 
> metrics at the stage level as a summary, similar to the existing stage-level 
> metrics, tracking numTotalSpeculated, numCompleted (successful), numFailed, 
> numKilled, etc. This enables us to effectively understand the speculative 
> execution feature at an application level and helps further tune the 
> speculation configs.
> cc [~ron8hu]
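
To make the summary fields mentioned above concrete, here is a purely 
illustrative sketch, not the reverted implementation; the field names simply 
mirror the ones listed in the description:

{code:scala}
// Illustrative only: a container for the stage-level speculation summary
// described above.
case class SpeculationStageSummary(
  numTotalSpeculated: Int, // speculative task attempts launched for the stage
  numCompleted: Int,       // speculative attempts that finished successfully
  numFailed: Int,          // speculative attempts that failed
  numKilled: Int)          // speculative attempts that were killed
{code}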



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-37158) Add doc about spark not supported hive built-in function

2021-11-07 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-37158.

Resolution: Won't Fix

See the discussion.
https://github.com/apache/spark/pull/34434#issuecomment-954545315

> Add doc about spark not supported hive built-in function
> 
>
> Key: SPARK-37158
> URL: https://issues.apache.org/jira/browse/SPARK-37158
> Project: Spark
>  Issue Type: Improvement
>  Components: docs
>Affects Versions: 3.2.0
>Reporter: angerszhu
>Priority: Major
>
> Add doc about spark not supported hive built-in function



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-37238) Upgrade ORC to 1.6.12

2021-11-07 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-37238.

Fix Version/s: 3.2.1
 Assignee: Dongjoon Hyun
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/34512

> Upgrade ORC to 1.6.12
> -
>
> Key: SPARK-37238
> URL: https://issues.apache.org/jira/browse/SPARK-37238
> Project: Spark
>  Issue Type: Bug
>  Components: Build
>Affects Versions: 3.2.1
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
> Fix For: 3.2.1
>
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-37211) More descriptions and adding an image to the failure message about enabling GitHub Actions

2021-11-07 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-37211.

Fix Version/s: 3.3.0
 Assignee: Yuto Akutsu
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/34487

> More descriptions and adding an image to the failure message about enabling 
> GitHub Actions
> --
>
> Key: SPARK-37211
> URL: https://issues.apache.org/jira/browse/SPARK-37211
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 3.3.0
>Reporter: Yuto Akutsu
>Assignee: Yuto Akutsu
>Priority: Minor
> Fix For: 3.3.0
>
>
> I've seen, and experienced, that the build-and-test workflow of first-time PRs 
> fails because developers forget to enable GitHub Actions on their own 
> repositories.
> I think developers will be able to notice the cause more quickly if we add 
> more descriptions and an image to the test-failure message.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-37231) Dynamic writes/reads of ANSI interval partitions

2021-11-07 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-37231.

Fix Version/s: 3.3.0
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/34506

> Dynamic writes/reads of ANSI interval partitions
> 
>
> Key: SPARK-37231
> URL: https://issues.apache.org/jira/browse/SPARK-37231
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Max Gekk
>Assignee: Max Gekk
>Priority: Major
> Fix For: 3.3.0
>
>
> Check, and fix if needed, dynamic partition writes of ANSI intervals.
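
For context, a minimal sketch of the kind of dynamic partition write being 
checked, assuming a Parquet target path chosen purely for illustration and an 
active SparkSession named `spark` (e.g. in spark-shell):

{code:scala}
// Dynamic partition overwrite keyed by an ANSI interval column (illustrative path).
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")
spark.sql("SELECT INTERVAL '1' YEAR AS i, 0 AS id")
  .write
  .mode("overwrite")
  .partitionBy("i")
  .parquet("/tmp/ansi_interval_dynamic")

// Reading the partition values back is the other half of what this ticket checks.
spark.read.schema("i INTERVAL YEAR, id INT").parquet("/tmp/ansi_interval_dynamic").show(false)
{code}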



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-35496) Upgrade Scala 2.13 to 2.13.7

2021-11-04 Thread Kousuke Saruta (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-35496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438986#comment-17438986
 ] 

Kousuke Saruta commented on SPARK-35496:


[~dongjoon]
Thank you for letting me know. That's great.

> Upgrade Scala 2.13 to 2.13.7
> 
>
> Key: SPARK-35496
> URL: https://issues.apache.org/jira/browse/SPARK-35496
> Project: Spark
>  Issue Type: Task
>  Components: Build
>Affects Versions: 3.3.0
>Reporter: Yang Jie
>Priority: Minor
>
> This issue aims to upgrade to Scala 2.13.7.
> Scala 2.13.6 was released (https://github.com/scala/scala/releases/tag/v2.13.6). 
> However, we skip 2.13.6 because there is a breaking behavior change in 2.13.6 
> which differs from both Scala 2.13.5 and Scala 3.
> - https://github.com/scala/bug/issues/12403
> {code}
> scala3-3.0.0:$ bin/scala
> scala> Array.empty[Double].intersect(Array(0.0))
> val res0: Array[Double] = Array()
> scala-2.13.6:$ bin/scala
> Welcome to Scala 2.13.6 (OpenJDK 64-Bit Server VM, Java 1.8.0_292).
> Type in expressions for evaluation. Or try :help.
> scala> Array.empty[Double].intersect(Array(0.0))
> java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [D
>   ... 32 elided
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-35496) Upgrade Scala 2.13 to 2.13.7

2021-11-04 Thread Kousuke Saruta (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-35496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438535#comment-17438535
 ] 

Kousuke Saruta commented on SPARK-35496:


[~LuciferYang] Scala 2.13.7 was released a few days ago.
https://github.com/scala/scala/releases/tag/v2.13.7
Would you like to continue to work on this?

> Upgrade Scala 2.13 to 2.13.7
> 
>
> Key: SPARK-35496
> URL: https://issues.apache.org/jira/browse/SPARK-35496
> Project: Spark
>  Issue Type: Task
>  Components: Build
>Affects Versions: 3.3.0
>Reporter: Yang Jie
>Priority: Minor
>
> This issue aims to upgrade to Scala 2.13.7.
> Scala 2.13.6 was released (https://github.com/scala/scala/releases/tag/v2.13.6). 
> However, we skip 2.13.6 because there is a breaking behavior change in 2.13.6 
> which differs from both Scala 2.13.5 and Scala 3.
> - https://github.com/scala/bug/issues/12403
> {code}
> scala3-3.0.0:$ bin/scala
> scala> Array.empty[Double].intersect(Array(0.0))
> val res0: Array[Double] = Array()
> scala-2.13.6:$ bin/scala
> Welcome to Scala 2.13.6 (OpenJDK 64-Bit Server VM, Java 1.8.0_292).
> Type in expressions for evaluation. Or try :help.
> scala> Array.empty[Double].intersect(Array(0.0))
> java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [D
>   ... 32 elided
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37206) Upgrade Avro to 1.11.0

2021-11-03 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-37206:
--

 Summary: Upgrade Avro to 1.11.0
 Key: SPARK-37206
 URL: https://issues.apache.org/jira/browse/SPARK-37206
 Project: Spark
  Issue Type: Improvement
  Components: Build
Affects Versions: 3.3.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


Recently, Avro 1.11.0 was released, which includes a bunch of bug fixes.
https://issues.apache.org/jira/issues/?jql=project%3DAVRO%20AND%20fixVersion%3D1.11.0



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-37108) Expose make_date expression in R

2021-11-03 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-37108.

Fix Version/s: 3.3.0
 Assignee: Leona Yoda
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/34480

> Expose make_date expression in R
> 
>
> Key: SPARK-37108
> URL: https://issues.apache.org/jira/browse/SPARK-37108
> Project: Spark
>  Issue Type: Improvement
>  Components: R
>Affects Versions: 3.3.0
>Reporter: Leona Yoda
>Assignee: Leona Yoda
>Priority: Minor
> Fix For: 3.3.0
>
>
> Expose make_date API on SparkR.
>  
> (cf. https://issues.apache.org/jira/browse/SPARK-36554)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-37159) Change HiveExternalCatalogVersionsSuite to be able to test with Java 17

2021-11-01 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-37159.

Fix Version/s: 3.3.0
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/34425

> Change HiveExternalCatalogVersionsSuite to be able to test with Java 17
> ---
>
> Key: SPARK-37159
> URL: https://issues.apache.org/jira/browse/SPARK-37159
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Tests
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Minor
> Fix For: 3.3.0
>
>
> SPARK-37105 seems to have fixed most of the tests in `sql/hive` for Java 17, 
> except `HiveExternalCatalogVersionsSuite`.
> {code}
> [info] org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED 
> *** (42 seconds, 526 milliseconds)
> [info]   spark-submit returned with exit code 1.
> [info]   Command line: 
> '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test-spark-d86af275-0c40-4b47-9cab-defa92a5ffa7/spark-3.2.0/bin/spark-submit'
>  '--name' 'prepare testing tables' '--master' 'local[2]' '--conf' 
> 'spark.ui.enabled=false' '--conf' 'spark.master.rest.enabled=false' '--conf' 
> 'spark.sql.hive.metastore.version=2.3' '--conf' 
> 'spark.sql.hive.metastore.jars=maven' '--conf' 
> 'spark.sql.warehouse.dir=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62'
>  '--conf' 'spark.sql.test.version.index=0' '--driver-java-options' 
> '-Dderby.system.home=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62'
>  
> '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test15166225869206697603.py'
> [info]   
> [info]   2021-10-28 06:07:18.486 - stderr> Using Spark's default log4j 
> profile: org/apache/spark/log4j-defaults.properties
> [info]   2021-10-28 06:07:18.49 - stderr> 21/10/28 22:07:18 INFO 
> SparkContext: Running Spark version 3.2.0
> [info]   2021-10-28 06:07:18.537 - stderr> 21/10/28 22:07:18 WARN 
> NativeCodeLoader: Unable to load native-hadoop library for your platform... 
> using builtin-java classes where applicable
> [info]   2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO 
> ResourceUtils: ==
> [info]   2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO 
> ResourceUtils: No custom resources configured for spark.driver.
> [info]   2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO 
> ResourceUtils: ==
> [info]   2021-10-28 06:07:18.617 - stderr> 21/10/28 22:07:18 INFO 
> SparkContext: Submitted application: prepare testing tables
> [info]   2021-10-28 06:07:18.632 - stderr> 21/10/28 22:07:18 INFO 
> ResourceProfile: Default ResourceProfile created, executor resources: 
> Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: 
> memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 
> 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
> [info]   2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO 
> ResourceProfile: Limiting resource is cpu
> [info]   2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO 
> ResourceProfileManager: Added ResourceProfile id: 0
> [info]   2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: Changing view acls to: kou
> [info]   2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: Changing modify acls to: kou
> [info]   2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: Changing view acls groups to: 
> [info]   2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: Changing modify acls groups to: 
> [info]   2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO 
> SecurityManager: SecurityManager: authentication disabled; ui acls disabled; 
> users  with view permissions: Set(kou); groups with view permissions: Set(); 
> users  with modify permissions: Set(kou); groups with modify permissions: 
> Set()
> [info]   2021-10-28 06:07:18.886 - stderr> 21/10/28 22:07:18 INFO Utils: 
> Successfully started service 'sparkDriver' on port 35867.
> [info]   2021-10-28 06:07:18.906 - stderr> 21/10/28 22:07:18 INFO SparkEnv: 
> Registering MapOutputTracker
> [info]   2021-10-28 06:07:18.93 - stderr> 21/10/28 22:07:18 INFO SparkEnv: 
> Registering BlockManagerMaster
> [info]   2021-10-28 06:07:18.943 - stderr> 21/10/28 

[jira] [Resolved] (SPARK-36554) Error message while trying to use spark sql functions directly on dataframe columns without using select expression

2021-11-01 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved SPARK-36554.

Fix Version/s: 3.3.0
 Assignee: Nicolas Azrak
   Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/34356

> Error message while trying to use spark sql functions directly on dataframe 
> columns without using select expression
> ---
>
> Key: SPARK-36554
> URL: https://issues.apache.org/jira/browse/SPARK-36554
> Project: Spark
>  Issue Type: Bug
>  Components: Documentation, Examples, PySpark
>Affects Versions: 3.1.1
>Reporter: Lekshmi Ramachandran
>Assignee: Nicolas Azrak
>Priority: Minor
>  Labels: documentation, features, functions, spark-sql
> Fix For: 3.3.0
>
> Attachments: Screen Shot .png
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The code below generates a dataframe successfully. Here the make_date function 
> is used inside a select expression:
>  
> from pyspark.sql.functions import  expr, make_date
> df = spark.createDataFrame([(2020, 6, 26), (1000, 2, 29), (-44, 1, 1)],['Y', 
> 'M', 'D'])
> df.select("*",expr("make_date(Y,M,D) as lk")).show()
>  
> The code below fails with the message "cannot import name 'make_date' from 
> 'pyspark.sql.functions'". Here the make_date function is called directly on 
> dataframe columns without a select expression:
>  
> from pyspark.sql.functions import make_date
> df = spark.createDataFrame([(2020, 6, 26), (1000, 2, 29), (-44, 1, 1)],['Y', 
> 'M', 'D'])
> df.select(make_date(df.Y,df.M,df.D).alias("datefield")).show()
>  
> The error message generated is misleading when it says "cannot import 
> make_date from pyspark.sql.functions".
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37170) Pin PySpark version installed in the Binder environment for tagged commit

2021-10-31 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37170:
---
Summary: Pin PySpark version installed in the Binder environment for tagged 
commit  (was: Pin PySpark version for Binder)

> Pin PySpark version installed in the Binder environment for tagged commit
> -
>
> Key: SPARK-37170
> URL: https://issues.apache.org/jira/browse/SPARK-37170
> Project: Spark
>  Issue Type: Bug
>  Components: docs, PySpark
>Affects Versions: 3.2.0
>Reporter: Kousuke Saruta
>Assignee: Apache Spark
>Priority: Major
>
> I noticed that PySpark 3.1.2 is installed in the live notebook 
> environment even though the notebook is for PySpark 3.2.0.
> http://spark.apache.org/docs/3.2.0/api/python/getting_started/index.html
> I guess someone accessed Binder and built the container image with v3.2.0 
> before we published the pyspark package to PyPI.
> https://mybinder.org/
> I think it's difficult to rebuild the image manually.
> To avoid such an accident, I propose pinning the version of PySpark in 
> binder/postBuild
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37170) Pin PySpark version for Binder

2021-10-30 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37170:
---
Description: 
I noticed that the PySpark 3.1.2 is installed in the live notebook environment 
even though the notebook is for PySpark 3.2.0.
http://spark.apache.org/docs/3.2.0/api/python/getting_started/index.html

I guess someone accessed to Binder and built the container image with v3.2.0 
before we published the pyspark package to PyPi.
https://mybinder.org/

I think it's difficult to rebuild the image manually.
To avoid such accident, I'll propose to pin the version of PySpark in 
binder/postBuild

 

 

  was:
I noticed that the PySpark 3.1.2 is installed in the live notebook environment 
even though the notebook is for PySpark 3.2.
http://spark.apache.org/docs/3.2.0/api/python/getting_started/index.html

I guess someone accessed to Binder and built the container image with v3.2.0 
before we published the pyspark package to PyPi.
https://mybinder.org/

I think it's difficult to rebuild the image manually.
To avoid such accident, I'll propose to pin the version of PySpark in 
binder/postBuild

 

 


> Pin PySpark version for Binder
> --
>
> Key: SPARK-37170
> URL: https://issues.apache.org/jira/browse/SPARK-37170
> Project: Spark
>  Issue Type: Bug
>  Components: docs, PySpark
>Affects Versions: 3.2.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Major
>
> I noticed that PySpark 3.1.2 is installed in the live notebook 
> environment even though the notebook is for PySpark 3.2.0.
> http://spark.apache.org/docs/3.2.0/api/python/getting_started/index.html
> I guess someone accessed Binder and built the container image with v3.2.0 
> before we published the pyspark package to PyPI.
> https://mybinder.org/
> I think it's difficult to rebuild the image manually.
> To avoid such an accident, I propose pinning the version of PySpark in 
> binder/postBuild
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37170) Pin PySpark version for Binder

2021-10-30 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-37170:
---
Description: 
I noticed that the PySpark 3.1.2 is installed in the live notebook environment 
even though the notebook is for PySpark 3.2.
http://spark.apache.org/docs/3.2.0/api/python/getting_started/index.html

I guess someone accessed to Binder and built the container image with v3.2.0 
before we published the pyspark package to PyPi.
https://mybinder.org/

I think it's difficult to rebuild the image manually.
To avoid such accident, I'll propose to pin the version of PySpark in 
binder/postBuild

 

 

  was:
I noticed that the PySpark 3.1.2 is installed in the environment of live 
notebook even though the notebook is for PySpark 3.2.
http://spark.apache.org/docs/3.2.0/api/python/getting_started/index.html

I guess someone accessed to Binder and built the container image with v3.2.0 
before we published the pyspark package to PyPi.
https://mybinder.org/

I think it's difficult to rebuild the image manually.
To avoid such accident, I'll propose to pin the version of PySpark in 
binder/postBuild

 

 


> Pin PySpark version for Binder
> --
>
> Key: SPARK-37170
> URL: https://issues.apache.org/jira/browse/SPARK-37170
> Project: Spark
>  Issue Type: Bug
>  Components: docs, PySpark
>Affects Versions: 3.2.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Major
>
> I noticed that PySpark 3.1.2 is installed in the live notebook 
> environment even though the notebook is for PySpark 3.2.
> http://spark.apache.org/docs/3.2.0/api/python/getting_started/index.html
> I guess someone accessed Binder and built the container image with v3.2.0 
> before we published the pyspark package to PyPI.
> https://mybinder.org/
> I think it's difficult to rebuild the image manually.
> To avoid such an accident, I propose pinning the version of PySpark in 
> binder/postBuild
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37170) Pin PySpark version for Binder

2021-10-30 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-37170:
--

 Summary: Pin PySpark version for Binder
 Key: SPARK-37170
 URL: https://issues.apache.org/jira/browse/SPARK-37170
 Project: Spark
  Issue Type: Bug
  Components: docs, PySpark
Affects Versions: 3.2.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


I noticed that the PySpark 3.1.2 is installed in the environment of live 
notebook even though the notebook is for PySpark 3.2.
http://spark.apache.org/docs/3.2.0/api/python/getting_started/index.html

I guess someone accessed to Binder and built the container image with v3.2.0 
before we published the pyspark package to PyPi.
https://mybinder.org/

I think it's difficult to rebuild the image manually.
To avoid such accident, I'll propose to pin the version of PySpark in 
binder/postBuild

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37159) Change HiveExternalCatalogVersionsSuite to be able to test with Java 17

2021-10-29 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-37159:
--

 Summary: Change HiveExternalCatalogVersionsSuite to be able to 
test with Java 17
 Key: SPARK-37159
 URL: https://issues.apache.org/jira/browse/SPARK-37159
 Project: Spark
  Issue Type: Bug
  Components: SQL, Tests
Affects Versions: 3.3.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta


SPARK-37105 seems to have fixed most of the tests in `sql/hive` for Java 17, 
except `HiveExternalCatalogVersionsSuite`.

{code}
[info] org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED 
*** (42 seconds, 526 milliseconds)
[info]   spark-submit returned with exit code 1.
[info]   Command line: 
'/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test-spark-d86af275-0c40-4b47-9cab-defa92a5ffa7/spark-3.2.0/bin/spark-submit'
 '--name' 'prepare testing tables' '--master' 'local[2]' '--conf' 
'spark.ui.enabled=false' '--conf' 'spark.master.rest.enabled=false' '--conf' 
'spark.sql.hive.metastore.version=2.3' '--conf' 
'spark.sql.hive.metastore.jars=maven' '--conf' 
'spark.sql.warehouse.dir=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62'
 '--conf' 'spark.sql.test.version.index=0' '--driver-java-options' 
'-Dderby.system.home=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62'
 
'/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test15166225869206697603.py'
[info]   
[info]   2021-10-28 06:07:18.486 - stderr> Using Spark's default log4j profile: 
org/apache/spark/log4j-defaults.properties
[info]   2021-10-28 06:07:18.49 - stderr> 21/10/28 22:07:18 INFO SparkContext: 
Running Spark version 3.2.0
[info]   2021-10-28 06:07:18.537 - stderr> 21/10/28 22:07:18 WARN 
NativeCodeLoader: Unable to load native-hadoop library for your platform... 
using builtin-java classes where applicable
[info]   2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO 
ResourceUtils: ==
[info]   2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO 
ResourceUtils: No custom resources configured for spark.driver.
[info]   2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO 
ResourceUtils: ==
[info]   2021-10-28 06:07:18.617 - stderr> 21/10/28 22:07:18 INFO SparkContext: 
Submitted application: prepare testing tables
[info]   2021-10-28 06:07:18.632 - stderr> 21/10/28 22:07:18 INFO 
ResourceProfile: Default ResourceProfile created, executor resources: Map(cores 
-> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 
1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , 
vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
[info]   2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO 
ResourceProfile: Limiting resource is cpu
[info]   2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO 
ResourceProfileManager: Added ResourceProfile id: 0
[info]   2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO 
SecurityManager: Changing view acls to: kou
[info]   2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO 
SecurityManager: Changing modify acls to: kou
[info]   2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO 
SecurityManager: Changing view acls groups to: 
[info]   2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO 
SecurityManager: Changing modify acls groups to: 
[info]   2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO 
SecurityManager: SecurityManager: authentication disabled; ui acls disabled; 
users  with view permissions: Set(kou); groups with view permissions: Set(); 
users  with modify permissions: Set(kou); groups with modify permissions: Set()
[info]   2021-10-28 06:07:18.886 - stderr> 21/10/28 22:07:18 INFO Utils: 
Successfully started service 'sparkDriver' on port 35867.
[info]   2021-10-28 06:07:18.906 - stderr> 21/10/28 22:07:18 INFO SparkEnv: 
Registering MapOutputTracker
[info]   2021-10-28 06:07:18.93 - stderr> 21/10/28 22:07:18 INFO SparkEnv: 
Registering BlockManagerMaster
[info]   2021-10-28 06:07:18.943 - stderr> 21/10/28 22:07:18 INFO 
BlockManagerMasterEndpoint: Using 
org.apache.spark.storage.DefaultTopologyMapper for getting topology information
[info]   2021-10-28 06:07:18.944 - stderr> 21/10/28 22:07:18 INFO 
BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
[info]   2021-10-28 06:07:18.945 - stdout> Traceback (most recent call last):
[info]   2021-10-28 06:07:18.946 - stdout>   File 
