[jira] [Created] (SPARK-46088) Add a self-contained example about creating dataframe from jdbc

2023-11-23 Thread BingKun Pan (Jira)
BingKun Pan created SPARK-46088:
---

 Summary: Add a self-contained example about creating dataframe 
from jdbc
 Key: SPARK-46088
 URL: https://issues.apache.org/jira/browse/SPARK-46088
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, PySpark
Affects Versions: 4.0.0
Reporter: BingKun Pan
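A minimal sketch of the kind of self-contained example this ticket asks for, in
PySpark (the JDBC URL, table name, and credentials below are placeholders, and a
matching JDBC driver jar is assumed to be on the classpath, e.g. via spark.jars):

{code:python}
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("jdbc-example").getOrCreate()

# Read a table through JDBC into a DataFrame; every option value here is a placeholder.
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://localhost:5432/mydb")
    .option("dbtable", "public.my_table")
    .option("user", "username")
    .option("password", "password")
    .load()
)

df.printSchema()
df.show()
{code}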









[jira] [Updated] (SPARK-46054) SPIP: Proposal to Adopt Google's Spark K8s Operator as Official Spark Operator

2023-11-23 Thread Vara Bonthu (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vara Bonthu updated SPARK-46054:

Description: 
*Description:*

This proposal recommends adopting [Google's Spark K8s 
Operator|https://github.com/GoogleCloudPlatform/spark-on-k8s-operator] as the 
official Spark Operator for the Apache Spark community. The operator has gained 
significant traction among many users and organizations and is used heavily in 
production environments, but challenges related to maintenance and governance 
necessitate this recommendation.

*Background:*
 * Google's Spark K8s Operator is currently in use by hundreds of users and 
organizations. However, due to maintenance issues, many of these users and 
organizations have resorted to forking the repository and implementing their 
own fixes.

 * The project boasts an impressive user base with 167 contributors, 2.5k 
likes, and endorsements from 45 organizations, as documented in the "Who is 
using" document. Notably, there are many more organizations using it than the 
initially reported 45.

 * The primary issue at hand is that this project resides under the 
GoogleCloudPlatform GitHub organization and is exclusively moderated by a 
Google employee. Concerns have been raised by numerous users and customers 
regarding the maintenance of the repository.

 * The existing Google maintainers are constrained by limitations in terms of 
time and support, which negatively impacts both the project and its user 
community.

 

*Recent Developments:*
 * During KubeCon Chicago 2023, AWS OSS Architects (Vara Bonthu) and the Apple 
infrastructure team engaged in discussions with Google's team, specifically 
with Marcin Wielgus. They expressed their interest in contributing the project 
to either the Kubeflow or Apache Spark community.

 * *Marcin from Google confirmed their willingness to donate the 
project to either of these communities.*

 * An adoption process has been initiated by the Kubeflow project under CNCF, 
as documented in the following thread: [Link to the 
thread|https://github.com/kubeflow/community/issues/648].

 

*Primary Goal:*
 * The primary goal is to ensure the collaborative support and adoption of 
Google's Spark Operator by the Apache Spark community, thereby avoiding the 
development of redundant tools and reducing confusion among users.

*Next Steps:*
 * *Meeting with Apache Spark Working Group Maintainers:* We propose arranging 
a meeting with the Apache Spark working group maintainers to delve deeper into 
this matter, address any questions or concerns they may have, and collectively 
work towards a decision.

 * *Establish a New Working Group:* Upon reaching an agreement, we intend to 
create a new working group comprising members from diverse organizations who 
are willing to contribute and collaborate on this initiative.

 * *Repository Transfer:* Our plan involves transferring the project repository 
from Google's organization to either the Apache or Kubeflow organization, 
aligning with the chosen community.

 * *Roadmap Development:* We will formulate a new roadmap that encompasses 
immediate issue resolution and a long-term design strategy aimed at enhancing 
performance, scalability, and security for this tool.

 
We believe that working towards one Spark Operator will benefit the Apache 
Spark community and address the current maintenance challenges. Your feedback 
and support in this matter are highly valued. Let's collaborate to ensure a 
robust and well-maintained Spark Operator for the Apache Spark community's 
benefit.

*Community members are encouraged to leave their comments or give a thumbs-up 
to express their support for adopting Google's Spark Operator as the official 
Apache Spark operator.*

 

*Proposed Authors*

Vara Bonthu (AWS)

Marcin Wielgus (Google)

 

  was:
*Description:*

This proposal recommends adopting [Google's Spark K8s 
Operator|https://github.com/GoogleCloudPlatform/spark-on-k8s-operator] as the 
official Spark Operator for the Apache Spark community. The operator has gained 
significant traction among many users and organizations and is used heavily in 
production environments, but challenges related to maintenance and governance 
necessitate this recommendation.

*Background:*
 * Google's Spark K8s Operator is currently in use by hundreds of users and 
organizations. However, due to maintenance issues, many of these users and 
organizations have resorted to forking the repository and implementing their 
own fixes.

 * The project boasts an impressive user base with 167 contributors, 2.5k 
likes, and endorsements from 45 organizations, as documented in the "Who is 
using" document. Notably, there are many more organizations using it than the 
initially reported 45.

 * The primary issue at hand is that this project resides under the 
GoogleCloudPlatform GitHub 

[jira] [Resolved] (SPARK-46084) Refactor data type casting operation for Categorical type.

2023-11-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-46084.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43993
[https://github.com/apache/spark/pull/43993]

> Refactor data type casting operation for Categorical type.
> --
>
> Key: SPARK-46084
> URL: https://issues.apache.org/jira/browse/SPARK-46084
> Project: Spark
>  Issue Type: Bug
>  Components: PS
>Affects Versions: 4.0.0
>Reporter: Haejoon Lee
>Assignee: Haejoon Lee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Using official API for better performance and readability.
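A minimal sketch of the kind of Categorical casting this refactor touches (an
assumption for illustration, since the ticket does not name a specific API):

{code:python}
import pandas as pd
import pyspark.pandas as ps

psser = ps.Series(["a", "b", "a", "c"])

# Cast to a categorical dtype through the pandas-on-Spark astype API.
cat = psser.astype(pd.CategoricalDtype(categories=["a", "b", "c"]))

print(cat.dtype)           # categorical dtype with categories ['a', 'b', 'c']
print(cat.cat.categories)  # Index(['a', 'b', 'c'], dtype='object')
{code}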






[jira] [Resolved] (SPARK-46073) Remove the special resolution of UnresolvedNamespace for certain commands

2023-11-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-46073.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43980
[https://github.com/apache/spark/pull/43980]

> Remove the special resolution of UnresolvedNamespace for certain commands
> -
>
> Key: SPARK-46073
> URL: https://issues.apache.org/jira/browse/SPARK-46073
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-46083) Make SparkNoSuchElementException as a canonical error API

2023-11-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-46083:
-

Assignee: Hyukjin Kwon

> Make SparkNoSuchElementException as a canonical error API
> -
>
> Key: SPARK-46083
> URL: https://issues.apache.org/jira/browse/SPARK-46083
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Minor
>  Labels: pull-request-available
>
> https://github.com/apache/spark/pull/43927 added SparkNoSuchElementException. 
> It should be a canonical error API, documented properly.






[jira] [Resolved] (SPARK-46083) Make SparkNoSuchElementException as a canonical error API

2023-11-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-46083.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43992
[https://github.com/apache/spark/pull/43992]

> Make SparkNoSuchElementException as a canonical error API
> -
>
> Key: SPARK-46083
> URL: https://issues.apache.org/jira/browse/SPARK-46083
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> https://github.com/apache/spark/pull/43927 added SparkNoSuchElementException. 
> It should be a canonical error API, documented properly.






[jira] [Updated] (SPARK-46087) Sync PySpark dependencies in docs and dev requirements

2023-11-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-46087:
---
Labels: pull-request-available  (was: )

> Sync PySpark dependencies in docs and dev requirements
> --
>
> Key: SPARK-46087
> URL: https://issues.apache.org/jira/browse/SPARK-46087
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 4.0.0
>Reporter: Haejoon Lee
>Priority: Major
>  Labels: pull-request-available
>
> There is an inconsistency between the docs and the dev environment. We should sync them.






[jira] [Created] (SPARK-46087) Sync PySpark dependencies in docs and dev requirements

2023-11-23 Thread Haejoon Lee (Jira)
Haejoon Lee created SPARK-46087:
---

 Summary: Sync PySpark dependencies in docs and dev requirements
 Key: SPARK-46087
 URL: https://issues.apache.org/jira/browse/SPARK-46087
 Project: Spark
  Issue Type: Bug
  Components: PySpark
Affects Versions: 4.0.0
Reporter: Haejoon Lee


There is an inconsistency between the docs and the dev environment. We should sync them.






[jira] [Comment Edited] (SPARK-45311) Encoder fails on many "NoSuchElementException: None.get" since 3.4.x, search for an encoder for a generic type, and since 3.5.x isn't "an expression encoder"

2023-11-23 Thread Marc Le Bihan (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-45311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789328#comment-17789328
 ] 

Marc Le Bihan edited comment on SPARK-45311 at 11/24/23 5:39 AM:
-

A breakpoint in the {{catalogueJeuxDeDonnees()}} test, at 
{{org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:116)}}:

!JavaTypeInference_116.png!

The caller of it:

!sparkIssue_02.png!

{{OMD_ID}} is a generic type, compatible with {{CatalogueId}}.

 


was (Author: mlebihan):
A breakpoint in :
    at 
org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:116)
 

!JavaTypeInference_116.png!

The caller of it :

!sparkIssue_02.png!

{{OMD_ID}} is a generic, compatible with {{{}CatalogueId{}}}.

 

> Encoder fails on many "NoSuchElementException: None.get" since 3.4.x, search 
> for an encoder for a generic type, and since 3.5.x isn't "an expression 
> encoder"
> -
>
> Key: SPARK-45311
> URL: https://issues.apache.org/jira/browse/SPARK-45311
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.4.0, 3.4.1, 3.5.0
> Environment: Debian 12
> Java 17
> Underlying Spring-Boot 2.7.14
>Reporter: Marc Le Bihan
>Priority: Major
> Attachments: JavaTypeInference_116.png, sparkIssue_02.png
>
>
> If you find it convenient, you might clone the 
> [https://gitlab.com/territoirevif/minimal-tests-spark-issue] project (that 
> does many operations around cities, local authorities and accounting with 
> open data) where I've extracted from my work what's necessary to make a set 
> of 35 tests that run correctly with Spark 3.3.x, and show the troubles 
> encountered with 3.4.x and 3.5.x.
>  
> It is working well with Spark 3.2.x and 3.3.x. But as soon as I select *Spark 
> 3.4.x*, where the encoder seems to have changed deeply, the encoder fails 
> with two problems:
>  
> *1)* It throws *java.util.NoSuchElementException: None.get* messages 
> everywhere.
> Asking over the Internet, I wasn't alone in facing this problem. Reading it, 
> you'll see that I've attempted to debug it, but my Scala skills are low.
> [https://stackoverflow.com/questions/76036349/encoders-bean-doesnt-work-anymore-on-a-java-pojo-with-spark-3-4-0]
> By the way, if possible, the encoder and decoder functions 
> should forward a parameter as soon as the name of the field being handled is 
> known, and then all along their processing, so that when the encoder is at 
> any point where it has to throw an exception, it knows the field it is 
> handling in its specific call and can send a message like:
> _java.util.NoSuchElementException: None.get when encoding [the 
> method or field it was targeting]_
>  
> *2)* *Not found an encoder of the type RS to Spark SQL internal 
> representation.* Consider to change the input type to one of supported at 
> (...)
> Or : Not found an encoder of the type *OMI_ID* to Spark SQL internal 
> representation (...)
>  
> where *RS* and *OMI_ID* are generic types.
> This is strange.
> [https://stackoverflow.com/questions/76045255/encoders-bean-attempts-to-check-the-validity-of-a-return-type-considering-its-ge]
>  
> *3)* When I switch to the *Spark 3.5.0* version, the same problems remain, 
> but another one adds itself to the list:
> "*Only expression encoders are supported for now*" on what was accepted 
> and working before.
>  






[jira] [Comment Edited] (SPARK-45311) Encoder fails on many "NoSuchElementException: None.get" since 3.4.x, search for an encoder for a generic type, and since 3.5.x isn't "an expression encoder"

2023-11-23 Thread Marc Le Bihan (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-45311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789328#comment-17789328
 ] 

Marc Le Bihan edited comment on SPARK-45311 at 11/24/23 5:39 AM:
-

A breakpoint in the {{catalogueJeuxDeDonnees()}} test, at 
{{org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:116)}}:

!JavaTypeInference_116.png!

The caller of it:

!sparkIssue_02.png!

{{OMD_ID}} is a generic type, compatible with {{CatalogueId}}.

 


was (Author: mlebihan):
A breakpoint in {{catalogueJeuxDeDonnees()}} test,  at : 
{{org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:116}}
 

!JavaTypeInference_116.png!

The caller of it :

!sparkIssue_02.png!

{{OMD_ID}} is a generic, compatible with {{{}CatalogueId{}}}.

 

> Encoder fails on many "NoSuchElementException: None.get" since 3.4.x, search 
> for an encoder for a generic type, and since 3.5.x isn't "an expression 
> encoder"
> -
>
> Key: SPARK-45311
> URL: https://issues.apache.org/jira/browse/SPARK-45311
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.4.0, 3.4.1, 3.5.0
> Environment: Debian 12
> Java 17
> Underlying Spring-Boot 2.7.14
>Reporter: Marc Le Bihan
>Priority: Major
> Attachments: JavaTypeInference_116.png, sparkIssue_02.png
>
>
> If you find it convenient, you might clone the 
> [https://gitlab.com/territoirevif/minimal-tests-spark-issue] project (that 
> does many operations around cities, local authorities and accounting with 
> open data) where I've extracted from my work what's necessary to make a set 
> of 35 tests that run correctly with Spark 3.3.x, and show the troubles 
> encountered with 3.4.x and 3.5.x.
>  
> It is working well with Spark 3.2.x and 3.3.x. But as soon as I select *Spark 
> 3.4.x*, where the encoder seems to have changed deeply, the encoder fails 
> with two problems:
>  
> *1)* It throws *java.util.NoSuchElementException: None.get* messages 
> everywhere.
> Asking over the Internet, I wasn't alone in facing this problem. Reading it, 
> you'll see that I've attempted to debug it, but my Scala skills are low.
> [https://stackoverflow.com/questions/76036349/encoders-bean-doesnt-work-anymore-on-a-java-pojo-with-spark-3-4-0]
> By the way, if possible, the encoder and decoder functions 
> should forward a parameter as soon as the name of the field being handled is 
> known, and then all along their processing, so that when the encoder is at 
> any point where it has to throw an exception, it knows the field it is 
> handling in its specific call and can send a message like:
> _java.util.NoSuchElementException: None.get when encoding [the 
> method or field it was targeting]_
>  
> *2)* *Not found an encoder of the type RS to Spark SQL internal 
> representation.* Consider to change the input type to one of supported at 
> (...)
> Or : Not found an encoder of the type *OMI_ID* to Spark SQL internal 
> representation (...)
>  
> where *RS* and *OMI_ID* are generic types.
> This is strange.
> [https://stackoverflow.com/questions/76045255/encoders-bean-attempts-to-check-the-validity-of-a-return-type-considering-its-ge]
>  
> *3)* When I switch to the *Spark 3.5.0* version, the same problems remain, 
> but another one adds itself to the list:
> "*Only expression encoders are supported for now*" on what was accepted 
> and working before.
>  






[jira] [Updated] (SPARK-45356) Adjust the Maven daily test configuration

2023-11-23 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-45356:
-
Summary: Adjust the Maven daily test configuration  (was: Optimize the 
Maven daily test configuration)

> Adjust the Maven daily test configuration
> -
>
> Key: SPARK-45356
> URL: https://issues.apache.org/jira/browse/SPARK-45356
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Commented] (SPARK-45311) Encoder fails on many "NoSuchElementException: None.get" since 3.4.x, search for an encoder for a generic type, and since 3.5.x isn't "an expression encoder"

2023-11-23 Thread Marc Le Bihan (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-45311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789328#comment-17789328
 ] 

Marc Le Bihan commented on SPARK-45311:
---

A breakpoint in:
    at 
org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:116)

!JavaTypeInference_116.png!

The caller of it:

!sparkIssue_02.png!

{{OMD_ID}} is a generic type, compatible with {{CatalogueId}}.

 

> Encoder fails on many "NoSuchElementException: None.get" since 3.4.x, search 
> for an encoder for a generic type, and since 3.5.x isn't "an expression 
> encoder"
> -
>
> Key: SPARK-45311
> URL: https://issues.apache.org/jira/browse/SPARK-45311
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.4.0, 3.4.1, 3.5.0
> Environment: Debian 12
> Java 17
> Underlying Spring-Boot 2.7.14
>Reporter: Marc Le Bihan
>Priority: Major
> Attachments: JavaTypeInference_116.png, sparkIssue_02.png
>
>
> If you find it convenient, you might clone the 
> [https://gitlab.com/territoirevif/minimal-tests-spark-issue] project (that 
> does many operations around cities, local authorities and accounting with 
> open data) where I've extracted from my work what's necessary to make a set 
> of 35 tests that run correctly with Spark 3.3.x, and show the troubles 
> encountered with 3.4.x and 3.5.x.
>  
> It is working well with Spark 3.2.x and 3.3.x. But as soon as I select *Spark 
> 3.4.x*, where the encoder seems to have changed deeply, the encoder fails 
> with two problems:
>  
> *1)* It throws *java.util.NoSuchElementException: None.get* messages 
> everywhere.
> Asking over the Internet, I wasn't alone in facing this problem. Reading it, 
> you'll see that I've attempted to debug it, but my Scala skills are low.
> [https://stackoverflow.com/questions/76036349/encoders-bean-doesnt-work-anymore-on-a-java-pojo-with-spark-3-4-0]
> By the way, if possible, the encoder and decoder functions 
> should forward a parameter as soon as the name of the field being handled is 
> known, and then all along their processing, so that when the encoder is at 
> any point where it has to throw an exception, it knows the field it is 
> handling in its specific call and can send a message like:
> _java.util.NoSuchElementException: None.get when encoding [the 
> method or field it was targeting]_
>  
> *2)* *Not found an encoder of the type RS to Spark SQL internal 
> representation.* Consider to change the input type to one of supported at 
> (...)
> Or : Not found an encoder of the type *OMI_ID* to Spark SQL internal 
> representation (...)
>  
> where *RS* and *OMI_ID* are generic types.
> This is strange.
> [https://stackoverflow.com/questions/76045255/encoders-bean-attempts-to-check-the-validity-of-a-return-type-considering-its-ge]
>  
> *3)* When I switch to the *Spark 3.5.0* version, the same problems remain, 
> but another one adds itself to the list:
> "*Only expression encoders are supported for now*" on what was accepted 
> and working before.
>  






[jira] [Updated] (SPARK-45311) Encoder fails on many "NoSuchElementException: None.get" since 3.4.x, search for an encoder for a generic type, and since 3.5.x isn't "an expression encoder"

2023-11-23 Thread Marc Le Bihan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marc Le Bihan updated SPARK-45311:
--
Attachment: sparkIssue_02.png

> Encoder fails on many "NoSuchElementException: None.get" since 3.4.x, search 
> for an encoder for a generic type, and since 3.5.x isn't "an expression 
> encoder"
> -
>
> Key: SPARK-45311
> URL: https://issues.apache.org/jira/browse/SPARK-45311
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.4.0, 3.4.1, 3.5.0
> Environment: Debian 12
> Java 17
> Underlying Spring-Boot 2.7.14
>Reporter: Marc Le Bihan
>Priority: Major
> Attachments: JavaTypeInference_116.png, sparkIssue_02.png
>
>
> If you find it convenient, you might clone the 
> [https://gitlab.com/territoirevif/minimal-tests-spark-issue] project (that 
> does many operations around cities, local authorities and accounting with 
> open data) where I've extracted from my work what's necessary to make a set 
> of 35 tests that run correctly with Spark 3.3.x, and show the troubles 
> encountered with 3.4.x and 3.5.x.
>  
> It is working well with Spark 3.2.x and 3.3.x. But as soon as I select *Spark 
> 3.4.x*, where the encoder seems to have changed deeply, the encoder fails 
> with two problems:
>  
> *1)* It throws *java.util.NoSuchElementException: None.get* messages 
> everywhere.
> Asking over the Internet, I wasn't alone in facing this problem. Reading it, 
> you'll see that I've attempted to debug it, but my Scala skills are low.
> [https://stackoverflow.com/questions/76036349/encoders-bean-doesnt-work-anymore-on-a-java-pojo-with-spark-3-4-0]
> By the way, if possible, the encoder and decoder functions 
> should forward a parameter as soon as the name of the field being handled is 
> known, and then all along their processing, so that when the encoder is at 
> any point where it has to throw an exception, it knows the field it is 
> handling in its specific call and can send a message like:
> _java.util.NoSuchElementException: None.get when encoding [the 
> method or field it was targeting]_
>  
> *2)* *Not found an encoder of the type RS to Spark SQL internal 
> representation.* Consider to change the input type to one of supported at 
> (...)
> Or : Not found an encoder of the type *OMI_ID* to Spark SQL internal 
> representation (...)
>  
> where *RS* and *OMI_ID* are generic types.
> This is strange.
> [https://stackoverflow.com/questions/76045255/encoders-bean-attempts-to-check-the-validity-of-a-return-type-considering-its-ge]
>  
> *3)* When I switch to the *Spark 3.5.0* version, the same problems remain, 
> but another one adds itself to the list:
> "*Only expression encoders are supported for now*" on what was accepted 
> and working before.
>  






[jira] [Updated] (SPARK-45311) Encoder fails on many "NoSuchElementException: None.get" since 3.4.x, search for an encoder for a generic type, and since 3.5.x isn't "an expression encoder"

2023-11-23 Thread Marc Le Bihan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marc Le Bihan updated SPARK-45311:
--
Attachment: JavaTypeInference_116.png

> Encoder fails on many "NoSuchElementException: None.get" since 3.4.x, search 
> for an encoder for a generic type, and since 3.5.x isn't "an expression 
> encoder"
> -
>
> Key: SPARK-45311
> URL: https://issues.apache.org/jira/browse/SPARK-45311
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.4.0, 3.4.1, 3.5.0
> Environment: Debian 12
> Java 17
> Underlying Spring-Boot 2.7.14
>Reporter: Marc Le Bihan
>Priority: Major
> Attachments: JavaTypeInference_116.png
>
>
> If you find it convenient, you might clone the 
> [https://gitlab.com/territoirevif/minimal-tests-spark-issue] project (that 
> does many operations around cities, local authorities and accounting with 
> open data) where I've extracted from my work what's necessary to make a set 
> of 35 tests that run correctly with Spark 3.3.x, and show the troubles 
> encountered with 3.4.x and 3.5.x.
>  
> It is working well with Spark 3.2.x and 3.3.x. But as soon as I select *Spark 
> 3.4.x*, where the encoder seems to have changed deeply, the encoder fails 
> with two problems:
>  
> *1)* It throws *java.util.NoSuchElementException: None.get* messages 
> everywhere.
> Asking over the Internet, I wasn't alone in facing this problem. Reading it, 
> you'll see that I've attempted to debug it, but my Scala skills are low.
> [https://stackoverflow.com/questions/76036349/encoders-bean-doesnt-work-anymore-on-a-java-pojo-with-spark-3-4-0]
> By the way, if possible, the encoder and decoder functions 
> should forward a parameter as soon as the name of the field being handled is 
> known, and then all along their processing, so that when the encoder is at 
> any point where it has to throw an exception, it knows the field it is 
> handling in its specific call and can send a message like:
> _java.util.NoSuchElementException: None.get when encoding [the 
> method or field it was targeting]_
>  
> *2)* *Not found an encoder of the type RS to Spark SQL internal 
> representation.* Consider to change the input type to one of supported at 
> (...)
> Or : Not found an encoder of the type *OMI_ID* to Spark SQL internal 
> representation (...)
>  
> where *RS* and *OMI_ID* are generic types.
> This is strange.
> [https://stackoverflow.com/questions/76045255/encoders-bean-attempts-to-check-the-validity-of-a-return-type-considering-its-ge]
>  
> *3)* When I switch to the *Spark 3.5.0* version, the same problems remain, 
> but another one adds itself to the list:
> "*Only expression encoders are supported for now*" on what was accepted 
> and working before.
>  






[jira] [Comment Edited] (SPARK-45311) Encoder fails on many "NoSuchElementException: None.get" since 3.4.x, search for an encoder for a generic type, and since 3.5.x isn't "an expression encoder"

2023-11-23 Thread Marc Le Bihan (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-45311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789321#comment-17789321
 ] 

Marc Le Bihan edited comment on SPARK-45311 at 11/24/23 5:15 AM:
-

I've updated the  [https://gitlab.com/territoirevif/minimal-tests-spark-issue] 
testing project accordingly.
{code:java}
---
Test set: 
fr.ecoemploi.adapters.outbound.spark.dataset.datagouv.CatalogueDatagouvIT
---
Tests run: 6, Failures: 1, Errors: 3, Skipped: 0, Time elapsed: 8.715 s <<< 
FAILURE! - in 
fr.ecoemploi.adapters.outbound.spark.dataset.datagouv.CatalogueDatagouvIT
catalogueJeuxDeDonneesEtRessources  Time elapsed: 1.498 s  <<< ERROR!
java.lang.ClassCastException: class [Ljava.lang.Object; cannot be cast to class 
[Ljava.lang.reflect.TypeVariable; ([Ljava.lang.Object; and 
[Ljava.lang.reflect.TypeVariable; are in module java.base of loader 'bootstrap')
    at 
org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:116)
    at 
org.apache.spark.sql.catalyst.JavaTypeInference$.$anonfun$encoderFor$1(JavaTypeInference.scala:140)
    at scala.collection.ArrayOps$.map$extension(ArrayOps.scala:929)
    at 
org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:138)
    at 
org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:60)
    at 
org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:53)
    at 
org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.javaBean(ExpressionEncoder.scala:62)
    at org.apache.spark.sql.Encoders$.bean(Encoders.scala:179)
    at org.apache.spark.sql.Encoders.bean(Encoders.scala)
    at 
fr.ecoemploi.adapters.outbound.spark.dataset.datagouv.CatalogueDatagouvJeuxDeDonneesDataset.catalogueDataset(CatalogueDatagouvJeuxDeDonneesDataset.java:100)
    at 
fr.ecoemploi.adapters.outbound.spark.dataset.datagouv.CatalogueDatagouvJeuxDeDonneesDataset.catalogueDataset(CatalogueDatagouvJeuxDeDonneesDataset.java:88)
    at 
fr.ecoemploi.adapters.outbound.spark.dataset.datagouv.CatalogueDatagouvIT.catalogueJeuxDeDonneesEtRessources(CatalogueDatagouvIT.java:161)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
    at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at 
org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:725)
    at 
org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
    at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
    at 
org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:149)
    at 
org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:140)
    at 
org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:84)
    at 
org.junit.jupiter.engine.execution.ExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(ExecutableInvoker.java:115)
    at 
org.junit.jupiter.engine.execution.ExecutableInvoker.lambda$invoke$0(ExecutableInvoker.java:105)
    at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
    at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
    at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
    at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
    at 
org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:104)
    at 
org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:98)
    at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:214)
    at 
org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
    at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:210)
    at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:135)
    at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:66)
    at 

[jira] [Updated] (SPARK-46058) [CORE] Add separate flag for privateKeyPassword

2023-11-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-46058:
---
Labels: pull-request-available  (was: )

> [CORE] Add separate flag for privateKeyPassword
> ---
>
> Key: SPARK-46058
> URL: https://issues.apache.org/jira/browse/SPARK-46058
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Hasnain Lakhani
>Priority: Major
>  Labels: pull-request-available
>
> Right now with config inheritance we support:
>  * JKS with password A, PEM with password B
>  * JKS with no password, PEM with password A
>  * JKS and PEM with no password
>  
> But we do not support the case where JKS has a password and PEM does not. If 
> we set keyPassword we will attempt to use it, and cannot set 
> `spark.ssl.rpc.keyPassword` to null. So let's make it a separate flag as the 
> easiest workaround.
>  
> This was noticed while migrating some existing deployments to the RPC SSL 
> support, where we use OpenSSL for RPC with a key that has no password.
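A sketch of the gap described above, written against SparkConf for illustration;
only keyPassword and spark.ssl.rpc.keyPassword come from the ticket text, and
spark.ssl.rpc.privateKeyPassword is the proposed flag, which does not exist yet:

{code:python}
from pyspark import SparkConf

conf = SparkConf()

# JKS keystore key protected by a password; with config inheritance this value
# also becomes the default for spark.ssl.rpc.keyPassword.
conf.set("spark.ssl.keyPassword", "jks-secret")

# The PEM private key used for RPC SSL has no password, but the inherited
# spark.ssl.rpc.keyPassword cannot be unset today:
# conf.set("spark.ssl.rpc.keyPassword", None)        # not supported

# Proposed workaround: a dedicated flag for the PEM key's password.
# conf.set("spark.ssl.rpc.privateKeyPassword", "")   # hypothetical new flag
{code}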






[jira] [Commented] (SPARK-45311) Encoder fails on many "NoSuchElementException: None.get" since 3.4.x, search for an encoder for a generic type, and since 3.5.x isn't "an expression encoder"

2023-11-23 Thread Marc Le Bihan (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-45311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789321#comment-17789321
 ] 

Marc Le Bihan commented on SPARK-45311:
---

I've updated the  [https://gitlab.com/territoirevif/minimal-tests-spark-issue] 
testing project accordingly.

I haven't gone further in debugging yet, because I'm having trouble attaching 
snapshot sources to this project. From it, IntelliJ sees the source roots in 
the second project, where Spark 3.4.2-SNAPSHOT is compiled, and offers to attach 
them, but when that is accepted it has no effect. I'm left with 
{{JavaTypeInference.scala}} and others having no sources (methods shown as = ???). I'm 
looking on Stack Overflow in case someone has a clue about that.
{code:java}
---
Test set: 
fr.ecoemploi.adapters.outbound.spark.dataset.datagouv.CatalogueDatagouvIT
---
Tests run: 6, Failures: 1, Errors: 3, Skipped: 0, Time elapsed: 8.715 s <<< 
FAILURE! - in 
fr.ecoemploi.adapters.outbound.spark.dataset.datagouv.CatalogueDatagouvIT
catalogueJeuxDeDonneesEtRessources  Time elapsed: 1.498 s  <<< ERROR!
java.lang.ClassCastException: class [Ljava.lang.Object; cannot be cast to class 
[Ljava.lang.reflect.TypeVariable; ([Ljava.lang.Object; and 
[Ljava.lang.reflect.TypeVariable; are in module java.base of loader 'bootstrap')
    at 
org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:116)
    at 
org.apache.spark.sql.catalyst.JavaTypeInference$.$anonfun$encoderFor$1(JavaTypeInference.scala:140)
    at scala.collection.ArrayOps$.map$extension(ArrayOps.scala:929)
    at 
org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:138)
    at 
org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:60)
    at 
org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:53)
    at 
org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.javaBean(ExpressionEncoder.scala:62)
    at org.apache.spark.sql.Encoders$.bean(Encoders.scala:179)
    at org.apache.spark.sql.Encoders.bean(Encoders.scala)
    at 
fr.ecoemploi.adapters.outbound.spark.dataset.datagouv.CatalogueDatagouvJeuxDeDonneesDataset.catalogueDataset(CatalogueDatagouvJeuxDeDonneesDataset.java:100)
    at 
fr.ecoemploi.adapters.outbound.spark.dataset.datagouv.CatalogueDatagouvJeuxDeDonneesDataset.catalogueDataset(CatalogueDatagouvJeuxDeDonneesDataset.java:88)
    at 
fr.ecoemploi.adapters.outbound.spark.dataset.datagouv.CatalogueDatagouvIT.catalogueJeuxDeDonneesEtRessources(CatalogueDatagouvIT.java:161)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
    at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at 
org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:725)
    at 
org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
    at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
    at 
org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:149)
    at 
org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:140)
    at 
org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:84)
    at 
org.junit.jupiter.engine.execution.ExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(ExecutableInvoker.java:115)
    at 
org.junit.jupiter.engine.execution.ExecutableInvoker.lambda$invoke$0(ExecutableInvoker.java:105)
    at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
    at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
    at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
    at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
    at 
org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:104)
    at 
org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:98)
    at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:214)
    at 
org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
    at 

[jira] [Created] (SPARK-46086) Fix potential non-atomic operation issues in ReloadingX509TrustManager

2023-11-23 Thread Yang Jie (Jira)
Yang Jie created SPARK-46086:


 Summary: Fix potential non-atomic operation issues in 
ReloadingX509TrustManager
 Key: SPARK-46086
 URL: https://issues.apache.org/jira/browse/SPARK-46086
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 4.0.0
Reporter: Yang Jie









[jira] [Updated] (SPARK-46086) Fix potential non-atomic operation issues in ReloadingX509TrustManager

2023-11-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-46086:
---
Labels: pull-request-available  (was: )

> Fix potential non-atomic operation issues in ReloadingX509TrustManager
> --
>
> Key: SPARK-46086
> URL: https://issues.apache.org/jira/browse/SPARK-46086
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Minor
>  Labels: pull-request-available
>







[jira] [Updated] (SPARK-45568) WholeStageCodegenSparkSubmitSuite flakiness

2023-11-23 Thread Josh Rosen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Rosen updated SPARK-45568:
---
Component/s: Tests

> WholeStageCodegenSparkSubmitSuite flakiness
> ---
>
> Key: SPARK-45568
> URL: https://issues.apache.org/jira/browse/SPARK-45568
> Project: Spark
>  Issue Type: Test
>  Components: Tests, Web UI
>Affects Versions: 4.0.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.2, 4.0.0, 3.5.1, 3.3.4
>
>







[jira] [Updated] (SPARK-45751) The default value of ‘spark.executor.logs.rolling.maxRetainedFiles' on the official website is incorrect

2023-11-23 Thread Josh Rosen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Rosen updated SPARK-45751:
---
Component/s: Documentation

> The default value of ‘spark.executor.logs.rolling.maxRetainedFiles' on the 
> official website is incorrect
> 
>
> Key: SPARK-45751
> URL: https://issues.apache.org/jira/browse/SPARK-45751
> Project: Spark
>  Issue Type: Improvement
>  Components: Documentation, Spark Core, UI
>Affects Versions: 3.5.0
>Reporter: chenyu
>Assignee: chenyu
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 3.4.2, 4.0.0, 3.5.1, 3.3.4
>
> Attachments: the default value.png, the value on the website.png
>
>







[jira] [Updated] (SPARK-45791) Rename `SparkConnectSessionHodlerSuite.scala` to `SparkConnectSessionHolderSuite.scala`

2023-11-23 Thread Josh Rosen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Rosen updated SPARK-45791:
---
Component/s: Tests

> Rename `SparkConnectSessionHodlerSuite.scala` to 
> `SparkConnectSessionHolderSuite.scala`
> ---
>
> Key: SPARK-45791
> URL: https://issues.apache.org/jira/browse/SPARK-45791
> Project: Spark
>  Issue Type: Bug
>  Components: Connect, Tests
>Affects Versions: 3.5.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.5.1
>
>







[jira] [Updated] (SPARK-46016) Fix pandas API support list properly

2023-11-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-46016:
---
Labels: pull-request-available  (was: )

> Fix pandas API support list properly
> 
>
> Key: SPARK-46016
> URL: https://issues.apache.org/jira/browse/SPARK-46016
> Project: Spark
>  Issue Type: Bug
>  Components: Documentation, Pandas API on Spark
>Affects Versions: 4.0.0
>Reporter: Haejoon Lee
>Priority: Major
>  Labels: pull-request-available
>
> Currently the Supported pandas API list is not generated properly, so we should fix it.






[jira] [Updated] (SPARK-46016) Fix pandas API support list properly

2023-11-23 Thread Haejoon Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haejoon Lee updated SPARK-46016:

Summary: Fix pandas API support list properly  (was: Correct Supported 
pandas API list)

> Fix pandas API support list properly
> 
>
> Key: SPARK-46016
> URL: https://issues.apache.org/jira/browse/SPARK-46016
> Project: Spark
>  Issue Type: Bug
>  Components: Documentation, Pandas API on Spark
>Affects Versions: 4.0.0
>Reporter: Haejoon Lee
>Priority: Major
>
> Currently the Supported pandas API list is not generated properly, so we should fix it.






[jira] [Updated] (SPARK-46016) Correct Supported pandas API list

2023-11-23 Thread Haejoon Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haejoon Lee updated SPARK-46016:

Summary: Correct Supported pandas API list  (was: Fix the script for 
Supported pandas API to work properly)

> Correct Supported pandas API list
> -
>
> Key: SPARK-46016
> URL: https://issues.apache.org/jira/browse/SPARK-46016
> Project: Spark
>  Issue Type: Bug
>  Components: Documentation, Pandas API on Spark
>Affects Versions: 4.0.0
>Reporter: Haejoon Lee
>Priority: Major
>
> Currently the Supported pandas API list is not generated properly, so we should fix it.






[jira] [Assigned] (SPARK-46082) Fix protobuf string representation for Pandas Functions API with Spark Connect

2023-11-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-46082:


Assignee: Hyukjin Kwon

> Fix protobuf string representation for Pandas Functions API with Spark Connect
> --
>
> Key: SPARK-46082
> URL: https://issues.apache.org/jira/browse/SPARK-46082
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect, PySpark
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Minor
>  Labels: pull-request-available
>
> {code}
> df = spark.range(1)
> df.mapInPandas(lambda x: x, df.schema)._plan.print()
> {code}
> prints as below. It should include functions.
> {code}
> 
>   
> {code}






[jira] [Resolved] (SPARK-46082) Fix protobuf string representation for Pandas Functions API with Spark Connect

2023-11-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-46082.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43991
[https://github.com/apache/spark/pull/43991]

> Fix protobuf string representation for Pandas Functions API with Spark Connect
> --
>
> Key: SPARK-46082
> URL: https://issues.apache.org/jira/browse/SPARK-46082
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect, PySpark
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> {code}
> df = spark.range(1)
> df.mapInPandas(lambda x: x, df.schema)._plan.print()
> {code}
> prints as below. It should include functions.
> {code}
> 
>   
> {code}






[jira] [Resolved] (SPARK-46067) Upgrade commons-compress to 1.25.0

2023-11-23 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie resolved SPARK-46067.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43974
[https://github.com/apache/spark/pull/43974]

> Upgrade commons-compress to 1.25.0
> --
>
> Key: SPARK-46067
> URL: https://issues.apache.org/jira/browse/SPARK-46067
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> https://commons.apache.org/proper/commons-compress/changes-report.html#a1.25.0






[jira] [Assigned] (SPARK-46067) Upgrade commons-compress to 1.25.0

2023-11-23 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie reassigned SPARK-46067:


Assignee: Yang Jie

> Upgrade commons-compress to 1.25.0
> --
>
> Key: SPARK-46067
> URL: https://issues.apache.org/jira/browse/SPARK-46067
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
>
> https://commons.apache.org/proper/commons-compress/changes-report.html#a1.25.0






[jira] [Updated] (SPARK-46085) Dataset.groupingSets in Scala Spark Connect client

2023-11-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-46085:
---
Labels: pull-request-available  (was: )

> Dataset.groupingSets in Scala Spark Connect client
> --
>
> Key: SPARK-46085
> URL: https://issues.apache.org/jira/browse/SPARK-46085
> Project: Spark
>  Issue Type: New Feature
>  Components: Connect, SQL
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
>
> Scala Spark Connect client for SPARK-45929






[jira] [Created] (SPARK-46085) Dataset.groupingSets in Scala Spark Connect client

2023-11-23 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-46085:


 Summary: Dataset.groupingSets in Scala Spark Connect client
 Key: SPARK-46085
 URL: https://issues.apache.org/jira/browse/SPARK-46085
 Project: Spark
  Issue Type: New Feature
  Components: Connect, SQL
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon


Scala Spark Connect client for SPARK-45929






[jira] [Updated] (SPARK-46084) Refactor data type casting operation for Categorical type.

2023-11-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-46084:
---
Labels: pull-request-available  (was: )

> Refactor data type casting operation for Categorical type.
> --
>
> Key: SPARK-46084
> URL: https://issues.apache.org/jira/browse/SPARK-46084
> Project: Spark
>  Issue Type: Bug
>  Components: PS
>Affects Versions: 4.0.0
>Reporter: Haejoon Lee
>Priority: Major
>  Labels: pull-request-available
>
> Using official API for better performance and readability.






[jira] [Resolved] (SPARK-46080) Upgrade Cloudpickle to 3.0.0

2023-11-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-46080.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43989
[https://github.com/apache/spark/pull/43989]

> Upgrade Cloudpickle to 3.0.0
> 
>
> Key: SPARK-46080
> URL: https://issues.apache.org/jira/browse/SPARK-46080
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> It includes official support of Python 3.12 
> (https://github.com/cloudpipe/cloudpickle/pull/517)






[jira] [Updated] (SPARK-46083) Make SparkNoSuchElementException as a canonical error API

2023-11-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-46083:
---
Labels: pull-request-available  (was: )

> Make SparkNoSuchElementException as a canonical error API
> -
>
> Key: SPARK-46083
> URL: https://issues.apache.org/jira/browse/SPARK-46083
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Priority: Minor
>  Labels: pull-request-available
>
> https://github.com/apache/spark/pull/43927 added SparkNoSuchElementException. 
> It should be a canonical error API, documented properly.






[jira] [Assigned] (SPARK-46080) Upgrade Cloudpickle to 3.0.0

2023-11-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-46080:
-

Assignee: Hyukjin Kwon

> Upgrade Cloudpickle to 3.0.0
> 
>
> Key: SPARK-46080
> URL: https://issues.apache.org/jira/browse/SPARK-46080
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
>
> It includes official support of Python 3.12 
> (https://github.com/cloudpipe/cloudpickle/pull/517)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46084) Refactor data type casting operation for Categorical type.

2023-11-23 Thread Haejoon Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haejoon Lee updated SPARK-46084:

Description: Using official API for better performance and readability.  
(was: Using official API for better performance.)

> Refactor data type casting operation for Categorical type.
> --
>
> Key: SPARK-46084
> URL: https://issues.apache.org/jira/browse/SPARK-46084
> Project: Spark
>  Issue Type: Bug
>  Components: PS
>Affects Versions: 4.0.0
>Reporter: Haejoon Lee
>Priority: Major
>
> Using official API for better performance and readability.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-46083) Make SparkNoSuchElementException as a canonical error API

2023-11-23 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-46083:


 Summary: Make SparkNoSuchElementException as a canonical error API
 Key: SPARK-46083
 URL: https://issues.apache.org/jira/browse/SPARK-46083
 Project: Spark
  Issue Type: Improvement
  Components: PySpark
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon


https://github.com/apache/spark/pull/43927 added SparkNoSuchElementException. 
It should be a canonical error API, documented properly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-46084) Refactor data type casting operation for Categorical type.

2023-11-23 Thread Haejoon Lee (Jira)
Haejoon Lee created SPARK-46084:
---

 Summary: Refactor data type casting operation for Categorical type.
 Key: SPARK-46084
 URL: https://issues.apache.org/jira/browse/SPARK-46084
 Project: Spark
  Issue Type: Bug
  Components: PS
Affects Versions: 4.0.0
Reporter: Haejoon Lee


Using official API for better performance.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46082) Fix protobuf string representation for Pandas Functions API with Spark Connect

2023-11-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-46082:
---
Labels: pull-request-available  (was: )

> Fix protobuf string representation for Pandas Functions API with Spark Connect
> --
>
> Key: SPARK-46082
> URL: https://issues.apache.org/jira/browse/SPARK-46082
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect, PySpark
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Priority: Minor
>  Labels: pull-request-available
>
> {code}
> df = spark.range(1)
> df.mapInPandas(lambda x: x, df.schema)._plan.print()
> {code}
> prints as below. It should include the functions.
> {code}
> 
>   
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-46082) Fix protobuf string representation for Pandas Functions API with Spark Connect

2023-11-23 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-46082:


 Summary: Fix protobuf string representation for Pandas Functions 
API with Spark Connect
 Key: SPARK-46082
 URL: https://issues.apache.org/jira/browse/SPARK-46082
 Project: Spark
  Issue Type: Improvement
  Components: Connect, PySpark
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon


{code}
df = spark.range(1)
df.mapInPandas(lambda x: x, df.schema)._plan.print()
{code}

prints as below. It should include the functions.

{code}

  
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-42891) Implement CoGrouped Map API

2023-11-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-42891:
---
Labels: pull-request-available  (was: )

> Implement CoGrouped Map API
> ---
>
> Key: SPARK-42891
> URL: https://issues.apache.org/jira/browse/SPARK-42891
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, PySpark
>Affects Versions: 3.4.0
>Reporter: Xinrong Meng
>Assignee: Xinrong Meng
>Priority: Major
>  Labels: pull-request-available
>
> Implement CoGrouped Map API.
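
For reference, the cogrouped map API in question looks roughly like the sketch 
below in classic PySpark (column names, data, and the output schema are 
illustrative assumptions; pandas and pyarrow are required); the ticket covers 
exposing the same API through Spark Connect:

{code:python}
import pandas as pd
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df1 = spark.createDataFrame([(1, 1.0), (2, 2.0), (1, 3.0)], ("id", "v1"))
df2 = spark.createDataFrame([(1, "x"), (2, "y")], ("id", "v2"))

def merge(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    # Each call receives the two cogrouped pandas DataFrames for one key.
    return pd.merge(left, right, on="id")

result = (
    df1.groupby("id")
       .cogroup(df2.groupby("id"))
       .applyInPandas(merge, schema="id long, v1 double, v2 string")
)
result.show()
{code}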



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46080) Upgrade Cloudpickle to 3.0.0

2023-11-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-46080:
--
Parent: SPARK-45981
Issue Type: Sub-task  (was: Improvement)

> Upgrade Cloudpickle to 3.0.0
> 
>
> Key: SPARK-46080
> URL: https://issues.apache.org/jira/browse/SPARK-46080
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
>
> It includes official support of Python 3.12 
> (https://github.com/cloudpipe/cloudpickle/pull/517)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-46081) Set DEDICATED_JVM_SBT_TESTS in `build_java21.yml`

2023-11-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-46081:
-

Assignee: Dongjoon Hyun

> Set DEDICATED_JVM_SBT_TESTS in `build_java21.yml`
> -
>
> Key: SPARK-46081
> URL: https://issues.apache.org/jira/browse/SPARK-46081
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-46081) Set DEDICATED_JVM_SBT_TESTS in `build_java21.yml`

2023-11-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-46081.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43990
[https://github.com/apache/spark/pull/43990]

> Set DEDICATED_JVM_SBT_TESTS in `build_java21.yml`
> -
>
> Key: SPARK-46081
> URL: https://issues.apache.org/jira/browse/SPARK-46081
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-46079) Install `torch` nightly only at Python 3.12 in Infra docker image

2023-11-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-46079:
-

Assignee: Dongjoon Hyun

> Install `torch` nightly only at Python 3.12 in Infra docker image
> -
>
> Key: SPARK-46079
> URL: https://issues.apache.org/jira/browse/SPARK-46079
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-46079) Install `torch` nightly only at Python 3.12 in Infra docker image

2023-11-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-46079.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43988
[https://github.com/apache/spark/pull/43988]

> Install `torch` nightly only at Python 3.12 in Infra docker image
> -
>
> Key: SPARK-46079
> URL: https://issues.apache.org/jira/browse/SPARK-46079
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46081) Set DEDICATED_JVM_SBT_TESTS in `build_java21.yml`

2023-11-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-46081:
---
Labels: pull-request-available  (was: )

> Set DEDICATED_JVM_SBT_TESTS in `build_java21.yml`
> -
>
> Key: SPARK-46081
> URL: https://issues.apache.org/jira/browse/SPARK-46081
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46080) Upgrade Cloudpickle to 3.0.0

2023-11-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-46080:
---
Labels: pull-request-available  (was: )

> Upgrade Cloudpickle to 3.0.0
> 
>
> Key: SPARK-46080
> URL: https://issues.apache.org/jira/browse/SPARK-46080
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
>
> It includes official support of Python 3.12 
> (https://github.com/cloudpipe/cloudpickle/pull/517)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-46081) Set DEDICATED_JVM_SBT_TESTS in `build_java21.yml`

2023-11-23 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-46081:
-

 Summary: Set DEDICATED_JVM_SBT_TESTS in `build_java21.yml`
 Key: SPARK-46081
 URL: https://issues.apache.org/jira/browse/SPARK-46081
 Project: Spark
  Issue Type: Sub-task
  Components: Project Infra
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-46080) Upgrade Cloudpickle to 3.0.0

2023-11-23 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-46080:


 Summary: Upgrade Cloudpickle to 3.0.0
 Key: SPARK-46080
 URL: https://issues.apache.org/jira/browse/SPARK-46080
 Project: Spark
  Issue Type: Improvement
  Components: PySpark
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon


It includes official support of Python 3.12 
(https://github.com/cloudpipe/cloudpickle/pull/517)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-42905) pyspark.ml.stat.Correlation - Spearman Correlation method giving incorrect and inconsistent results for the same DataFrame if it has huge amount of Ties.

2023-11-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-42905:
---
Labels: correctness pull-request-available  (was: correctness)

> pyspark.ml.stat.Correlation - Spearman Correlation method giving incorrect 
> and inconsistent results for the same DataFrame if it has huge amount of Ties.
> -
>
> Key: SPARK-42905
> URL: https://issues.apache.org/jira/browse/SPARK-42905
> Project: Spark
>  Issue Type: Bug
>  Components: ML
>Affects Versions: 3.3.0
>Reporter: dronzer
>Priority: Critical
>  Labels: correctness, pull-request-available
> Attachments: image-2023-03-23-10-51-28-420.png, 
> image-2023-03-23-10-52-11-481.png, image-2023-03-23-10-52-49-392.png, 
> image-2023-03-23-10-53-37-461.png, image-2023-03-23-10-55-26-879.png
>
>
> pyspark.ml.stat.Correlation
> Following is the scenario where the Correlation function fails to give 
> correct Spearman coefficient results.
> Tested example -> the Spark DataFrame has 2 columns, A and B.
> !image-2023-03-23-10-55-26-879.png|width=562,height=162!
> Column A has 3 distinct values and a total of 108 million rows.
> Column B has 4 distinct values and a total of 108 million rows.
> If I calculate the correlation for this DataFrame with Python pandas DF.corr, 
> it gives the correct answer, and even if I run the same code multiple times, 
> the same answer is produced. (Each column has only 3-4 distinct values.)
> !image-2023-03-23-10-53-37-461.png|width=468,height=287!
>  
> In Spark, by contrast, the Spearman correlation produces *different results* 
> for the *same DataFrame* on multiple runs (see below), even though each column 
> in this DataFrame has only 3-4 distinct values.
> !image-2023-03-23-10-52-49-392.png|width=516,height=322!
>  
> Basically, pandas DF.corr gives the same result for the same DataFrame on 
> multiple runs, which is the expected behaviour. In Spark, however, the same 
> data gives a different result; moreover, re-running the same cell with the same 
> data multiple times produces different results, so the output is inconsistent.
> Looking at the data, the only observation I could draw is the ties in the data 
> (only 3-4 distinct values over 108M rows). This scenario is not handled by the 
> Spark Correlation method, since the same data produces consistent results in 
> Python with df.corr.
> The only workaround we could find to get consistent output, matching Python, 
> in Spark is using a Pandas UDF as shown below:
> !image-2023-03-23-10-52-11-481.png|width=518,height=111!
> !image-2023-03-23-10-51-28-420.png|width=509,height=270!
>  
> We also tried the pyspark.pandas.DataFrame.corr method and it produces incorrect 
> and inconsistent results for this case too.
> Only the Pandas UDF seems to provide consistent results.
>  
> Another point to note: if I add some random noise to the data, which in turn 
> increases the number of distinct values, it again gives consistent results on 
> every run. This makes me believe that the Python version handles ties correctly 
> and gives consistent results no matter how many ties exist, whereas the PySpark 
> method is somehow not able to handle many ties in the data.
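
For illustration, one possible shape of the Pandas UDF workaround mentioned 
above is sketched below (column names, data, and the single-group trick are 
assumptions, not taken from the attached screenshots). Funnelling all rows into 
a single pandas group will not scale to 108M rows, but it shows the idea of 
letting pandas rank the ties:

{code:python}
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1, 2), (1, 3), (2, 3), (3, 4)], ("A", "B"))

def spearman(pdf: pd.DataFrame) -> pd.DataFrame:
    # pandas uses average ranks for ties, which gives stable results.
    corr = pdf["A"].corr(pdf["B"], method="spearman")
    return pd.DataFrame({"spearman_corr": [corr]})

result = (
    df.withColumn("g", F.lit(1))   # one artificial group covering the whole frame
      .groupBy("g")
      .applyInPandas(spearman, schema="spearman_corr double")
)
result.show()
{code}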



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46079) Install `torch` nightly only at Python 3.12 in Infra docker image

2023-11-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-46079:
---
Labels: pull-request-available  (was: )

> Install `torch` nightly only at Python 3.12 in Infra docker image
> -
>
> Key: SPARK-46079
> URL: https://issues.apache.org/jira/browse/SPARK-46079
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-46079) Install `torch` nightly only at Python 3.12 in Infra docker image

2023-11-23 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-46079:
-

 Summary: Install `torch` nightly only at Python 3.12 in Infra 
docker image
 Key: SPARK-46079
 URL: https://issues.apache.org/jira/browse/SPARK-46079
 Project: Spark
  Issue Type: Sub-task
  Components: Project Infra
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-46076) Remove `unittest` deprecated alias usage for Python 3.12

2023-11-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-46076.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43986
[https://github.com/apache/spark/pull/43986]

> Remove `unittest` deprecated alias usage for Python 3.12
> 
>
> Key: SPARK-46076
> URL: https://issues.apache.org/jira/browse/SPARK-46076
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark, Tests
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46076) Remove `unittest` deprecated alias usage for Python 3.12

2023-11-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-46076:
--
Summary: Remove `unittest` deprecated alias usage for Python 3.12  (was: 
Remove `unittest` alias usage for Python 3.12)

> Remove `unittest` deprecated alias usage for Python 3.12
> 
>
> Key: SPARK-46076
> URL: https://issues.apache.org/jira/browse/SPARK-46076
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark, Tests
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-46076) Remove `unittest` alias usage for Python 3.12

2023-11-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-46076:
-

Assignee: Dongjoon Hyun

> Remove `unittest` alias usage for Python 3.12
> -
>
> Key: SPARK-46076
> URL: https://issues.apache.org/jira/browse/SPARK-46076
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark, Tests
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-46078) Upgrade `pytorch` for Python 3.12

2023-11-23 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-46078:
-

 Summary: Upgrade `pytorch` for Python 3.12
 Key: SPARK-46078
 URL: https://issues.apache.org/jira/browse/SPARK-46078
 Project: Spark
  Issue Type: Sub-task
  Components: PySpark
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun


https://github.com/pytorch/pytorch/issues/110436



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46076) Remove `unittest` alias usage for Python 3.12

2023-11-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-46076:
--
Parent: SPARK-45981
Issue Type: Sub-task  (was: Test)

> Remove `unittest` alias usage for Python 3.12
> -
>
> Key: SPARK-46076
> URL: https://issues.apache.org/jira/browse/SPARK-46076
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark, Tests
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-46077) Error in postgresql when pushing down filter by timestamp_ntz field

2023-11-23 Thread Marina Krasilnikova (Jira)
Marina Krasilnikova created SPARK-46077:
---

 Summary: Error in postgresql when pushing down filter by 
timestamp_ntz field
 Key: SPARK-46077
 URL: https://issues.apache.org/jira/browse/SPARK-46077
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 3.5.0
Reporter: Marina Krasilnikova


code to reproduce:

SparkSession sparkSession = SparkSession
.builder()
.appName("test-app")
.master("local[*]")
.config("spark.sql.timestampType", "TIMESTAMP_NTZ")
.getOrCreate();

String url = "...";

String catalogPropPrefix = "spark.sql.catalog.myc";
sparkSession.conf().set(catalogPropPrefix, JDBCTableCatalog.class.getName());
sparkSession.conf().set(catalogPropPrefix + ".url", url);

Map<String, String> options = new HashMap<>();
options.put("driver", "org.postgresql.Driver");
// options.put("pushDownPredicate", "false");  it works fine if  this line is 
uncommented

Dataset<Row> dataset = sparkSession.read()
.options(options)
.table("myc.demo.`My table`");

dataset.createOrReplaceTempView("view1");
String sql = "select * from view1 where `my date` = '2021-04-01 00:00:00'";
Dataset<Row> result = sparkSession.sql(sql);
result.show();
result.printSchema();

 

Field `my date` is of type timestamp. This code results in an 
org.postgresql.util.PSQLException syntax error, because the resulting SQL lacks 
quotes around the timestamp literal in the filter condition (something like 
"my date" = 2021-04-01T00:00).
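
A rough PySpark equivalent of the Java repro above is sketched below (the 
connection URL, catalog name, and table are placeholders, and the fully 
qualified JDBCTableCatalog class name is an assumption about what the Java 
snippet imports):

{code:python}
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("test-app")
    .master("local[*]")
    .config("spark.sql.timestampType", "TIMESTAMP_NTZ")
    .getOrCreate()
)

url = "jdbc:postgresql://localhost:5432/demo"  # placeholder
catalog = "spark.sql.catalog.myc"
spark.conf.set(catalog, "org.apache.spark.sql.execution.datasources.v2.jdbc.JDBCTableCatalog")
spark.conf.set(catalog + ".url", url)

dataset = (
    spark.read
    .option("driver", "org.postgresql.Driver")
    # .option("pushDownPredicate", "false")  # reported to avoid the error
    .table("myc.demo.`My table`")
)
dataset.createOrReplaceTempView("view1")
result = spark.sql("select * from view1 where `my date` = '2021-04-01 00:00:00'")
result.show()
result.printSchema()
{code}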

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46076) Remove `unittest` alias usage for Python 3.12

2023-11-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-46076:
---
Labels: pull-request-available  (was: )

> Remove `unittest` alias usage for Python 3.12
> -
>
> Key: SPARK-46076
> URL: https://issues.apache.org/jira/browse/SPARK-46076
> Project: Spark
>  Issue Type: Test
>  Components: PySpark, Tests
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-46076) Remove `unittest` alias usage for Python 3.12

2023-11-23 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-46076:
-

 Summary: Remove `unittest` alias usage for Python 3.12
 Key: SPARK-46076
 URL: https://issues.apache.org/jira/browse/SPARK-46076
 Project: Spark
  Issue Type: Test
  Components: PySpark, Tests
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46075) Refactor SparkConnectSessionManager to not use guava cache

2023-11-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-46075:
---
Labels: pull-request-available  (was: )

> Refactor SparkConnectSessionManager to not use guava cache
> --
>
> Key: SPARK-46075
> URL: https://issues.apache.org/jira/browse/SPARK-46075
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect
>Affects Versions: 4.0.0
>Reporter: Juliusz Sompolski
>Priority: Major
>  Labels: pull-request-available
>
> Guava cache gives limited control over session eviction; for example, more 
> complex session eviction policies cannot be expressed. Refactor it to be more 
> similar to SparkConnectExecutionManager. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-37660) Spark-3.2.0 Fetch Hbase Data not working

2023-11-23 Thread Istvan Toth (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789235#comment-17789235
 ] 

Istvan Toth commented on SPARK-37660:
-

I have encountered this.

There are several issues:
- HBase returns the HBase region size instead of the split size, which may not 
be the same.
- HBase rounds the size to megabytes.
- Even if it didn't round to megabytes, I suspect that it only tallies HFiles, 
so for new tables the size may still be zero until the first HFile is written.

> Spark-3.2.0 Fetch Hbase Data not working
> 
>
> Key: SPARK-37660
> URL: https://issues.apache.org/jira/browse/SPARK-37660
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.2.0
> Environment: Hadoop version : hadoop-2.9.2
> HBase version : hbase-2.2.5
> Spark version : spark-3.2.0-bin-without-hadoop
> java version : jdk1.8.0_151
> scala version : scala-sdk-2.12.10
> os version : Red Hat Enterprise Linux Server release 6.6 (Santiago)
>Reporter: Bhavya Raj Sharma
>Priority: Major
>
> Below is the sample code snippet that is used to fetch data from HBase. This 
> used to work fine with spark-3.1.1.
> However, after upgrading to spark-3.2.0 it is not working. The issue is that it 
> does not throw any exception, it just doesn't fill the RDD.
>  
> {code:java}
>  
>    def getInfo(sc: SparkContext, startDate:String, cachingValue: Int, 
> sparkLoggerParams: SparkLoggerParams, zkIP: String, zkPort: String): 
> RDD[(String)] = {
> val scan = new Scan
>     scan.addFamily("family")
>     scan.addColumn("family","time")
>     val rdd = getHbaseConfiguredRDDFromScan(sc, zkIP, zkPort, "myTable", 
> scan, cachingValue, sparkLoggerParams)
>     val output: RDD[(String)] = rdd.map { row =>
>       (Bytes.toString(row._2.getRow))
>     }
>     output
>   }
>  
> def getHbaseConfiguredRDDFromScan(sc: SparkContext, zkIP: String, zkPort: 
> String, tableName: String,
>                                     scan: Scan, cachingValue: Int, 
> sparkLoggerParams: SparkLoggerParams): NewHadoopRDD[ImmutableBytesWritable, 
> Result] = {
>     scan.setCaching(cachingValue)
>     val scanString = 
> Base64.getEncoder.encodeToString(org.apache.hadoop.hbase.protobuf.ProtobufUtil.toScan(scan).toByteArray)
>     val hbaseContext = new SparkHBaseContext(zkIP, zkPort)
>     val hbaseConfig = hbaseContext.getConfiguration()
>     hbaseConfig.set(TableInputFormat.INPUT_TABLE, tableName)
>     hbaseConfig.set(TableInputFormat.SCAN, scanString)
>     sc.newAPIHadoopRDD(
>       hbaseConfig,
>       classOf[TableInputFormat],
>       classOf[ImmutableBytesWritable], classOf[Result]
>     ).asInstanceOf[NewHadoopRDD[ImmutableBytesWritable, Result]]
>   }
>  
> {code}
>  
> If we fetch using the scan directly, without using newAPIHadoopRDD, it works.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46011) Spark Connect session heartbeat / keepalive

2023-11-23 Thread Juliusz Sompolski (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Juliusz Sompolski updated SPARK-46011:
--
Epic Link: SPARK-43754

> Spark Connect session heartbeat / keepalive
> ---
>
> Key: SPARK-46011
> URL: https://issues.apache.org/jira/browse/SPARK-46011
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect
>Affects Versions: 4.0.0
>Reporter: Juliusz Sompolski
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-46011) Spark Connect session heartbeat / keepalive

2023-11-23 Thread Juliusz Sompolski (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Juliusz Sompolski resolved SPARK-46011.
---
Resolution: Won't Fix

Decided to not add this at this point.

> Spark Connect session heartbeat / keepalive
> ---
>
> Key: SPARK-46011
> URL: https://issues.apache.org/jira/browse/SPARK-46011
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect
>Affects Versions: 4.0.0
>Reporter: Juliusz Sompolski
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-46075) Refactor SparkConnectSessionManager to not use guava cache

2023-11-23 Thread Juliusz Sompolski (Jira)
Juliusz Sompolski created SPARK-46075:
-

 Summary: Refactor SparkConnectSessionManager to not use guava cache
 Key: SPARK-46075
 URL: https://issues.apache.org/jira/browse/SPARK-46075
 Project: Spark
  Issue Type: Improvement
  Components: Connect
Affects Versions: 4.0.0
Reporter: Juliusz Sompolski


Guava cache gives limited control over session eviction; for example, more 
complex session eviction policies cannot be expressed. Refactor it to be more 
similar to SparkConnectExecutionManager. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46074) [CONNECT][SCALA] Insufficient details in error when a UDF fails

2023-11-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-46074:
---
Labels: pull-request-available  (was: )

> [CONNECT][SCALA] Insufficient details in error when a UDF fails
> ---
>
> Key: SPARK-46074
> URL: https://issues.apache.org/jira/browse/SPARK-46074
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect
>Affects Versions: 3.5.0
>Reporter: Niranjan Jayakar
>Priority: Major
>  Labels: pull-request-available
>
> Currently, when a UDF fails, the connect client does not receive the actual 
> error that caused the failure. 
> As an example, the error message looks like -
> {code:java}
> Exception in thread "main" org.apache.spark.SparkException: 
> grpc_shaded.io.grpc.StatusRuntimeException: INTERNAL: Job aborted due to 
> stage failure: Task 2 in stage 0.0 failed 4 times, most recent failure: Lost 
> task 2.3 in stage 0.0 (TID 10) (10.68.141.158 executor 0): 
> org.apache.spark.SparkException: [FAILED_EXECUTE_UDF] Failed to execute user 
> defined function (` (Main$$$Lambda$4770/1714264622)`: (int) => int). 
> SQLSTATE: 39000 {code}
> In this case, the actual error was a {{{}java.lang.NoClassDefFoundError{}}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-46074) [CONNECT][SCALA] Insufficient details in error when a UDF fails

2023-11-23 Thread Niranjan Jayakar (Jira)
Niranjan Jayakar created SPARK-46074:


 Summary: [CONNECT][SCALA] Insufficient details in error when a UDF 
fails
 Key: SPARK-46074
 URL: https://issues.apache.org/jira/browse/SPARK-46074
 Project: Spark
  Issue Type: Improvement
  Components: Connect
Affects Versions: 3.5.0
Reporter: Niranjan Jayakar


Currently, when a UDF fails, the connect client does not receive the actual 
error that caused the failure. 

As an example, the error message looks like -
{code:java}
Exception in thread "main" org.apache.spark.SparkException: 
grpc_shaded.io.grpc.StatusRuntimeException: INTERNAL: Job aborted due to stage 
failure: Task 2 in stage 0.0 failed 4 times, most recent failure: Lost task 2.3 
in stage 0.0 (TID 10) (10.68.141.158 executor 0): 
org.apache.spark.SparkException: [FAILED_EXECUTE_UDF] Failed to execute user 
defined function (` (Main$$$Lambda$4770/1714264622)`: (int) => int). SQLSTATE: 
39000 {code}
In this case, the actual error was a {{{}java.lang.NoClassDefFoundError{}}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-46070) Precompile regex patterns in SparkDateTimeUtils.getZoneId

2023-11-23 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk reassigned SPARK-46070:


Assignee: Tanel Kiis

> Precompile regex patterns in SparkDateTimeUtils.getZoneId
> -
>
> Key: SPARK-46070
> URL: https://issues.apache.org/jira/browse/SPARK-46070
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.5.0
>Reporter: Tanel Kiis
>Assignee: Tanel Kiis
>Priority: Major
>  Labels: pull-request-available
>
> SparkDateTimeUtils.getZoneId uses String.replaceFirst method, that internally 
> does a Pattern.compile(regex). This method is called once for each dataset 
> row when using functions like from_utc_timestamp.
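
The change is essentially hoisting the Pattern.compile out of the per-row path. 
A minimal sketch of the same idea in Python is below (the actual pattern used by 
getZoneId is not shown in this ticket, so the regex here is an assumption; note 
also that Python's re module caches compiled patterns, so the gain is much 
larger on the JVM than in this illustration):

{code:python}
import re

# Per-call compilation, analogous to String.replaceFirst, which calls
# Pattern.compile internally on every invocation.
def normalize_zone_slow(zone_id: str) -> str:
    return re.sub(r"^(GMT|UTC)(?=[+-])", "", zone_id, count=1)

# Compile once, reuse for every row.
_ZONE_PREFIX = re.compile(r"^(GMT|UTC)(?=[+-])")

def normalize_zone_fast(zone_id: str) -> str:
    return _ZONE_PREFIX.sub("", zone_id, count=1)

print(normalize_zone_fast("GMT+08:00"))  # prints "+08:00"
{code}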



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-46070) Precompile regex patterns in SparkDateTimeUtils.getZoneId

2023-11-23 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk resolved SPARK-46070.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43976
[https://github.com/apache/spark/pull/43976]

> Precompile regex patterns in SparkDateTimeUtils.getZoneId
> -
>
> Key: SPARK-46070
> URL: https://issues.apache.org/jira/browse/SPARK-46070
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.5.0
>Reporter: Tanel Kiis
>Assignee: Tanel Kiis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> SparkDateTimeUtils.getZoneId uses String.replaceFirst method, that internally 
> does a Pattern.compile(regex). This method is called once for each dataset 
> row when using functions like from_utc_timestamp.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46069) Support unwrap timestamp type to date type

2023-11-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-46069:
---
Labels: pull-request-available  (was: )

> Support unwrap timestamp type to date type
> --
>
> Key: SPARK-46069
> URL: https://issues.apache.org/jira/browse/SPARK-46069
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Wan Kun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-43980) Add support for EXCEPT in select clause, similar to what databricks provides

2023-11-23 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-43980:
---

Assignee: Yash Kothari

> Add support for EXCEPT in select clause, similar to what databricks provides
> 
>
> Key: SPARK-43980
> URL: https://issues.apache.org/jira/browse/SPARK-43980
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Yash Kothari
>Assignee: Yash Kothari
>Priority: Major
>  Labels: pull-request-available
>
> I'm looking for a way to incorporate the {{select * except(col1, ...)}} 
> clause provided by Databricks into my workflow. I don't use Databricks and 
> would like to introduce this {{select except}} clause either as a 
> spark-package or by contributing a change to Spark.
> However, I'm unsure about how to begin this process and would appreciate any 
> guidance from the community.
> [https://docs.databricks.com/sql/language-manual/sql-ref-syntax-qry-select.html#examples]
>  
> Thank you for your assistance.
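
For illustration, the requested clause in the Databricks dialect looks like the 
sketch below (table and column names are made up; the exact syntax Spark ends up 
supporting is defined by the linked pull request):

{code:python}
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.createDataFrame(
    [(1, "a", 10.0), (2, "b", 20.0)], ("id", "name", "score")
).createOrReplaceTempView("t")

# Select every column except `name`.
spark.sql("SELECT * EXCEPT (name) FROM t").show()
{code}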



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-43980) Add support for EXCEPT in select clause, similar to what databricks provides

2023-11-23 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-43980.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43843
[https://github.com/apache/spark/pull/43843]

> Add support for EXCEPT in select clause, similar to what databricks provides
> 
>
> Key: SPARK-43980
> URL: https://issues.apache.org/jira/browse/SPARK-43980
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Yash Kothari
>Assignee: Yash Kothari
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> I'm looking for a way to incorporate the {{select * except(col1, ...)}} 
> clause provided by Databricks into my workflow. I don't use Databricks and 
> would like to introduce this {{select except}} clause either as a 
> spark-package or by contributing a change to Spark.
> However, I'm unsure about how to begin this process and would appreciate any 
> guidance from the community.
> [https://docs.databricks.com/sql/language-manual/sql-ref-syntax-qry-select.html#examples]
>  
> Thank you for your assistance.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-46062) CTE reference node does not inherit the flag `isStreaming` from CTE definition node

2023-11-23 Thread Jungtaek Lim (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim reassigned SPARK-46062:


Assignee: Jungtaek Lim

> CTE reference node does not inherit the flag `isStreaming` from CTE 
> definition node
> ---
>
> Key: SPARK-46062
> URL: https://issues.apache.org/jira/browse/SPARK-46062
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.2, 3.4.1, 3.5.0, 4.0.0
>Reporter: Jungtaek Lim
>Assignee: Jungtaek Lim
>Priority: Major
>  Labels: pull-request-available
>
> Looks like this is a long standing bug.
> We figured out that CTE reference node would never set the isStreaming flag 
> to true, regardless of the value for flag in CTE definition. The node cannot 
> determine the right value of isStreaming flag by itself (likewise it cannot 
> determine about resolution by itself) but it has no parameter in constructor, 
> hence always takes the default (no children, so batch one).
> This may impact some rules which behave differently depending on the isStreaming 
> flag. It would no longer be a problem once CTE reference is replaced with CTE 
> definition at some point in "optimization phase", but all rules in analyzer 
> and optimizer being triggered before the rule takes effect may be impacted.
> We probably couldn't sync the flag in real time, but we should sync the flag 
> when we mark CTE reference to be "resolved". The rule `ResolveWithCTE` will 
> be a good place to do that.
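
For context, a minimal sketch of the kind of query where the flag matters is 
below (the rate source and view name are illustrative; with the fix, the CTE 
reference inherits the streaming flag from its definition):

{code:python}
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# A streaming relation: the CTE definition over it is streaming, so the CTE
# reference should report isStreaming = true as well.
spark.readStream.format("rate").load().createOrReplaceTempView("events")

df = spark.sql("""
    WITH recent AS (SELECT value FROM events WHERE value % 2 = 0)
    SELECT * FROM recent
""")
print(df.isStreaming)  # expected: True
{code}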



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46054) SPIP: Proposal to Adopt Google's Spark K8s Operator as Official Spark Operator

2023-11-23 Thread Vara Bonthu (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vara Bonthu updated SPARK-46054:

Description: 
*Description:*

This proposal aims to recommend the adoption of [Google's Spark K8s 
Operator|https://github.com/GoogleCloudPlatform/spark-on-k8s-operator] as the 
official Spark Operator for the Apache Spark community. The operator has gained 
significant traction among many users and organizations and used heavily in 
production environments, but challenges related to maintenance and governance 
necessitate this recommendation.

*Background:*
 * Google's Spark K8s Operator is currently in use by hundreds of users and 
organizations. However, due to maintenance issues, many of these users and 
organizations have resorted to forking the repository and implementing their 
own fixes.

 * The project boasts an impressive user base with 167 contributors, 2.5k 
likes, and endorsements from 45 organizations, as documented in the "Who is 
using" document. Notably, there are many more organizations using it than the 
initially reported 45.

 * The primary issue at hand is that this project resides under the 
GoogleCloudPlatform GitHub organization and is exclusively moderated by a 
Google employee. Concerns have been raised by numerous users and customers 
regarding the maintenance of the repository.

 * The existing Google maintainers are constrained by limitations in terms of 
time and support, which negatively impacts both the project and its user 
community.

 

*Recent Developments:*
 * During Kubecon Chicago 2023, AWS OSS Architects (Vara Bonthu) and the Apple 
infrastructure team engaged in discussions with the Google's team, specifically 
with Marcin Wielgus. They expressed their interest in contributing the project 
to either the Kubeflow or Apache Spark community.

 * *{color:#00875a}Marcin from Google confirmed their willingness to donate the 
project to either of these communities.{color}*

 * An adoption process has been initiated by the Kubeflow project under CNCF, 
as documented in the following thread: [Link to the 
thread|https://github.com/kubeflow/community/issues/648].

 

*Primary Goal:*
 * The primary goal is to ensure the collaborative support and adoption of 
Google's Spark Operator by the Apache Spark , thereby avoiding the development 
of redundant tools and reducing confusion among users.

*Next Steps:*
 * *Meeting with Apache Spark Working Group Maintainers:* We propose arranging 
a meeting with the Apache Spark working group maintainers to delve deeper into 
this matter, address any questions or concerns they may have, and collectively 
work towards a decision.

 * *Establish a New Working Group:* Upon reaching an agreement, we intend to 
create a new working group comprising members from diverse organizations who 
are willing to contribute and collaborate on this initiative.

 * *Repository Transfer:* Our plan involves transferring the project repository 
from Google's organization to either the Apache or Kubeflow organization, 
aligning with the chosen community.

 * *Roadmap Development:* We will formulate a new roadmap that encompasses 
immediate issue resolution and a long-term design strategy aimed at enhancing 
performance, scalability, and security for this tool.

 
We believe that working towards one Spark Operator will benefit the Apache 
Spark community and address the current maintenance challenges. Your feedback 
and support in this matter are highly valued. Let's collaborate to ensure a 
robust and well-maintained Spark Operator for the Apache Spark community's 
benefit.

*Community members are encouraged to leave their comments or give a thumbs-up 
to express their support for adopting Google's Spark Operator as the official 
Apache Spark operator.*

 

*Proposed Authors*

Vara Bonthu (AWS)

Andrey Velichkevich (Apple)

Chaoran Yu (Apple)

Marcin Wielgus (Google)

Rus Pandey (Apple)

 

  was:
*Description:*

This proposal aims to recommend the adoption of [Google's Spark K8s 
Operator|https://github.com/GoogleCloudPlatform/spark-on-k8s-operator] as the 
official Spark Operator for the Apache Spark community. The operator has gained 
significant traction among many users and organizations and used heavily in 
production environments, but challenges related to maintenance and governance 
necessitate this recommendation.



*Background:*
 * Google's Spark K8s Operator is currently in use by hundreds of users and 
organizations. However, due to maintenance issues, many of these users and 
organizations have resorted to forking the repository and implementing their 
own fixes.

 * The project boasts an impressive user base with 167 contributors, 2.5k 
likes, and endorsements from 45 organizations, as documented in the "Who is 
using" document. Notably, there are many more organizations using it than the 
initially reported 45.

 * The primary issue at hand is that 

[jira] [Resolved] (SPARK-46062) CTE reference node does not inherit the flag `isStreaming` from CTE definition node

2023-11-23 Thread Jungtaek Lim (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim resolved SPARK-46062.
--
Fix Version/s: 3.5.1
   4.0.0
   3.4.2
   Resolution: Fixed

Issue resolved by pull request 43966
[https://github.com/apache/spark/pull/43966]

> CTE reference node does not inherit the flag `isStreaming` from CTE 
> definition node
> ---
>
> Key: SPARK-46062
> URL: https://issues.apache.org/jira/browse/SPARK-46062
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.2, 3.4.1, 3.5.0, 4.0.0
>Reporter: Jungtaek Lim
>Assignee: Jungtaek Lim
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.1, 4.0.0, 3.4.2
>
>
> Looks like this is a long standing bug.
> We figured out that CTE reference node would never set the isStreaming flag 
> to true, regardless of the value for flag in CTE definition. The node cannot 
> determine the right value of isStreaming flag by itself (likewise it cannot 
> determine about resolution by itself) but it has no parameter in constructor, 
> hence always takes the default (no children, so batch one).
> This may impact some rules which behave differently depending on the isStreaming 
> flag. It would no longer be a problem once CTE reference is replaced with CTE 
> definition at some point in "optimization phase", but all rules in analyzer 
> and optimizer being triggered before the rule takes effect may be impacted.
> We probably couldn't sync the flag in real time, but we should sync the flag 
> when we mark CTE reference to be "resolved". The rule `ResolveWithCTE` will 
> be a good place to do that.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46054) SPIP: Proposal to Adopt Google's Spark K8s Operator as Official Spark Operator

2023-11-23 Thread Vara Bonthu (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vara Bonthu updated SPARK-46054:

Description: 
*Description:*

This proposal aims to recommend the adoption of [Google's Spark K8s 
Operator|https://github.com/GoogleCloudPlatform/spark-on-k8s-operator] as the 
official Spark Operator for the Apache Spark community. The operator has gained 
significant traction among many users and organizations and used heavily in 
production environments, but challenges related to maintenance and governance 
necessitate this recommendation.



*Background:*
 * Google's Spark K8s Operator is currently in use by hundreds of users and 
organizations. However, due to maintenance issues, many of these users and 
organizations have resorted to forking the repository and implementing their 
own fixes.

 * The project boasts an impressive user base with 167 contributors, 2.5k 
likes, and endorsements from 45 organizations, as documented in the "Who is 
using" document. Notably, there are many more organizations using it than the 
initially reported 45.

 * The primary issue at hand is that this project resides under the 
GoogleCloudPlatform GitHub organization and is exclusively moderated by a 
Google employee. Concerns have been raised by numerous users and customers 
regarding the maintenance of the repository.

 * The existing Google maintainers are constrained by limitations in terms of 
time and support, which negatively impacts both the project and its user 
community.

 

*Recent Developments:*
 * During Kubecon Chicago 2023, AWS OSS Architects (Vara Bonthu) and the Apple 
infrastructure team engaged in discussions with the Google's team, specifically 
with Marcin Wielgus. They expressed their interest in contributing the project 
to either the Kubeflow or Apache Spark community.

 * *{color:#00875a}Marcin from Google confirmed their willingness to donate the 
project to either of these communities.{color}*

 * An adoption process has been initiated by the Kubeflow project under CNCF, 
as documented in the following thread: [Link to the 
thread|https://github.com/kubeflow/community/issues/648].

 

*Primary Goal:*
**The primary goal is to ensure the collaborative support and adoption of 
Google's Spark Operator by the Apache Spark (supported by Kubeflow and CNCF 
communities) , thereby avoiding the development of redundant tools and reducing 
confusion among users.

 

*Next Steps:*
 * *Meeting with Apache Spark Working Group Maintainers:* We propose arranging 
a meeting with the Apache Spark working group maintainers to delve deeper into 
this matter, address any questions or concerns they may have, and collectively 
work towards a decision.

 * *Establish a New Working Group:* Upon reaching an agreement, we intend to 
create a new working group comprising members from diverse organizations who 
are willing to contribute and collaborate on this initiative.

 * *Repository Transfer:* Our plan involves transferring the project repository 
from Google's organization to either the Apache or Kubeflow organization, 
aligning with the chosen community.

 * *Roadmap Development:* We will formulate a new roadmap that encompasses 
immediate issue resolution and a long-term design strategy aimed at enhancing 
performance, scalability, and security for this tool.

 
We believe that working towards one Spark Operator will benefit the Apache 
Spark community and address the current maintenance challenges. Your feedback 
and support in this matter are highly valued. Let's collaborate to ensure a 
robust and well-maintained Spark Operator for the Apache Spark community's 
benefit.

*Community members are encouraged to leave their comments or give a thumbs-up 
to express their support for adopting Google's Spark Operator as the official 
Apache Spark operator.*

 

*Proposed Authors*

Vara Bonthu (AWS)

Andrey Velichkevich (Apple)

Chaoran Yu (Apple)

Marcin Wielgus (Google)

Rus Pandey (Apple)


 

  was:
*Description:*

This proposal aims to recommend the adoption of [Google's Spark K8s 
Operator|https://github.com/GoogleCloudPlatform/spark-on-k8s-operator] as the 
official Spark Operator for the Apache Spark community. The operator has gained 
significant traction among many users and organizations and used heavily in 
production environments, but challenges related to maintenance and governance 
necessitate this recommendation.



*Background:*
 * Google's Spark K8s Operator is currently in use by hundreds of users and 
organizations. However, due to maintenance issues, many of these users and 
organizations have resorted to forking the repository and implementing their 
own fixes.

 * The project boasts an impressive user base with 167 contributors, 2.5k 
likes, and endorsements from 45 organizations, as documented in the "Who is 
using" document. Notably, there are many more organizations using it than the 
initially 

[jira] [Created] (SPARK-46073) Remove the special resolution of UnresolvedNamespace for certain commands

2023-11-23 Thread Wenchen Fan (Jira)
Wenchen Fan created SPARK-46073:
---

 Summary: Remove the special resolution of UnresolvedNamespace for 
certain commands
 Key: SPARK-46073
 URL: https://issues.apache.org/jira/browse/SPARK-46073
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 4.0.0
Reporter: Wenchen Fan






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46071) TreeNode.toJSON may result in OOM when there are multiple levels of nesting of expressions.

2023-11-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-46071:
---
Labels: pull-request-available  (was: )

> TreeNode.toJSON may result in OOM when there are multiple levels of nesting 
> of expressions.
> ---
>
> Key: SPARK-46071
> URL: https://issues.apache.org/jira/browse/SPARK-46071
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: JacobZheng
>Priority: Major
>  Labels: pull-request-available
>
> I am encountering an OOM exception when executing the following code:
> {code:scala}
> parser.parseExpression(sql).toJSON
> {code}
> This SQL is a deeply nested {*}_CaseWhen_{*}. After testing, I found that the 
> number of expressions in the JSON increases exponentially as the nesting depth 
> increases.
> Here are some examples:
> sql:
> {code:sql}
> CASE WHEN(`cost` <= 275) THEN '(270-275]' 
> ELSE '' END
> {code}
> json:
> {code:json}
> [
> {
> "class":"org.apache.spark.sql.catalyst.expressions.CaseWhen",
> "num-children":3,
> "branches":[
> {
> "product-class":"scala.Tuple2",
> "_1":[
> {
> 
> "class":"org.apache.spark.sql.catalyst.expressions.LessThanOrEqual",
> "num-children":2,
> "left":0,
> "right":1
> },
> {
> 
> "class":"org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute",
> "num-children":0,
> "nameParts":"[cost]"
> },
> {
> 
> "class":"org.apache.spark.sql.catalyst.expressions.Literal",
> "num-children":0,
> "value":"275",
> "dataType":"integer"
> }
> ],
> "_2":[
> {
> 
> "class":"org.apache.spark.sql.catalyst.expressions.Literal",
> "num-children":0,
> "value":"(270-275]",
> "dataType":"string"
> }
> ]
> }
> ],
> "elseValue":[
> {
> "class":"org.apache.spark.sql.catalyst.expressions.Literal",
> "num-children":0,
> "value":"",
> "dataType":"string"
> }
> ]
> },
> {
> "class":"org.apache.spark.sql.catalyst.expressions.LessThanOrEqual",
> "num-children":2,
> "left":0,
> "right":1
> },
> {
> "class":"org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute",
> "num-children":0,
> "nameParts":"[cost]"
> },
> {
> "class":"org.apache.spark.sql.catalyst.expressions.Literal",
> "num-children":0,
> "value":"275",
> "dataType":"integer"
> },
> {
> "class":"org.apache.spark.sql.catalyst.expressions.Literal",
> "num-children":0,
> "value":"(270-275]",
> "dataType":"string"
> },
> {
> "class":"org.apache.spark.sql.catalyst.expressions.Literal",
> "num-children":0,
> "value":"",
> "dataType":"string"
> }
> ]
> {code}
> The child nodes of the *_CaseWhen_* expression are stored twice in JSON.
> When *_CaseWhen_* is nested twice, the child expression of the first case 
> when is repeated 4 times, and so on.
> {code:sql}
> CASE WHEN(`cost` <= 270) THEN '(265-270]'
> ELSE 
> CASE WHEN(`cost` <= 275) THEN '(270-275]' 
> ELSE '' END END
> {code}
> Nesting the *_CaseWhen_* expression n times in this case will result in 
> 2^n+11 expressions in the json.
> The reason for this problem is that the fields of *_CaseWhen_* cannot be 
> converted to child-node indexes when the {*}_jsonFields_{*} method is executed.
> Perhaps simplifying the *_CaseWhen_* JSON a bit by overriding the *_jsonFields_* 
> method is a viable way to go.
> {code:json}
> [
> {
> "class":"org.apache.spark.sql.catalyst.expressions.CaseWhen",
> "num-children":3,
> "branches":[
> {
> "condition":0,
> "value":1
> }
> ],
> "elseValue":2
> },
> {
> "class":"org.apache.spark.sql.catalyst.expressions.LessThanOrEqual",
> "num-children":2,
> "left":0,
> "right":1
> },
> {
> "class":"org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute",
> 

[jira] [Resolved] (SPARK-46021) Support canceling future jobs belonging to a certain job group on `cancelJobGroup` call

2023-11-23 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-46021.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43926
[https://github.com/apache/spark/pull/43926]

> Support canceling future jobs belonging to a certain job group on 
> `cancelJobGroup` call
> ---
>
> Key: SPARK-46021
> URL: https://issues.apache.org/jira/browse/SPARK-46021
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Xinyi Yu
>Assignee: Xinyi Yu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-46021) Support canceling future jobs belonging to a certain job group on `cancelJobGroup` call

2023-11-23 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-46021:
---

Assignee: Xinyi Yu

> Support canceling future jobs belonging to a certain job group on 
> `cancelJobGroup` call
> ---
>
> Key: SPARK-46021
> URL: https://issues.apache.org/jira/browse/SPARK-46021
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Xinyi Yu
>Assignee: Xinyi Yu
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-30385) WebUI occasionally throw IOException on stop()

2023-11-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-30385:
---
Labels: pull-request-available  (was: )

> WebUI occasionally throw IOException on stop()
> --
>
> Key: SPARK-30385
> URL: https://issues.apache.org/jira/browse/SPARK-30385
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 3.0.0
> Environment: MacOS 10.14.6
> Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_231
> Scala version 2.12.10
>Reporter: wuyi
>Assignee: Kousuke Saruta
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0
>
>
> While using ./bin/spark-shell recently, I have occasionally seen an IOException 
> when I try to quit:
> {code:java}
> 19/12/30 17:33:21 WARN AbstractConnector:
> java.io.IOException: No such file or directory
>  at sun.nio.ch.NativeThread.signal(Native Method)
>  at 
> sun.nio.ch.ServerSocketChannelImpl.implCloseSelectableChannel(ServerSocketChannelImpl.java:292)
>  at 
> java.nio.channels.spi.AbstractSelectableChannel.implCloseChannel(AbstractSelectableChannel.java:234)
>  at 
> java.nio.channels.spi.AbstractInterruptibleChannel.close(AbstractInterruptibleChannel.java:115)
>  at org.eclipse.jetty.server.ServerConnector.close(ServerConnector.java:368)
>  at 
> org.eclipse.jetty.server.AbstractNetworkConnector.shutdown(AbstractNetworkConnector.java:105)
>  at org.eclipse.jetty.server.Server.doStop(Server.java:439)
>  at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
>  
>  at org.apache.spark.ui.ServerInfo.stop(JettyUtils.scala:499)
>  at org.apache.spark.ui.WebUI.$anonfun$stop$2(WebUI.scala:173)
>  at org.apache.spark.ui.WebUI.$anonfun$stop$2$adapted(WebUI.scala:173)
>  at scala.Option.foreach(Option.scala:407)
>  at org.apache.spark.ui.WebUI.stop(WebUI.scala:173)
>  at org.apache.spark.ui.SparkUI.stop(SparkUI.scala:101)
>  at org.apache.spark.SparkContext.$anonfun$stop$6(SparkContext.scala:1972)
>  at 
> org.apache.spark.SparkContext.$anonfun$stop$6$adapted(SparkContext.scala:1972)
>  at scala.Option.foreach(Option.scala:407)
>  at org.apache.spark.SparkContext.$anonfun$stop$5(SparkContext.scala:1972)
>  at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1357)
>  at org.apache.spark.SparkContext.stop(SparkContext.scala:1972)
>  at org.apache.spark.repl.Main$.$anonfun$doMain$3(Main.scala:79)
>  at org.apache.spark.repl.Main$.$anonfun$doMain$3$adapted(Main.scala:79)
>  at scala.Option.foreach(Option.scala:407)
>  at org.apache.spark.repl.Main$.doMain(Main.scala:79)
>  at org.apache.spark.repl.Main$.main(Main.scala:58)
>  at org.apache.spark.repl.Main.main(Main.scala)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) 
>  at 
> org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
>  at 
> org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:928)
>  at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>  at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>  at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) 
>  at 
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
>  at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
>  at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> {code}
> I haven't found a way to reproduce it reliably, but the likelihood increases 
> if you stay in spark-shell for a while.  
> A possible way to reproduce this is: start ./bin/spark-shell, wait for 5 
> minutes, then use :q or :quit to quit.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46054) SPIP: Proposal to Adopt Google's Spark K8s Operator as Official Spark Operator

2023-11-23 Thread Jungtaek Lim (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim updated SPARK-46054:
-
Fix Version/s: (was: 4.0.0)

> SPIP: Proposal to Adopt Google's Spark K8s Operator as Official Spark Operator
> --
>
> Key: SPARK-46054
> URL: https://issues.apache.org/jira/browse/SPARK-46054
> Project: Spark
>  Issue Type: New Feature
>  Components: Kubernetes
>Affects Versions: 3.5.0
>Reporter: Vara Bonthu
>Priority: Minor
>
> *Description:*
> This proposal aims to recommend the adoption of [Google's Spark K8s 
> Operator|https://github.com/GoogleCloudPlatform/spark-on-k8s-operator] as the 
> official Spark Operator for the Apache Spark community. The operator has 
> gained significant traction among many users and organizations and is used 
> heavily in production environments, but challenges related to maintenance and 
> governance necessitate this recommendation.
> *Background:*
>  * Google's Spark K8s Operator is currently in use by hundreds of users and 
> organizations. However, due to maintenance issues, many of these users and 
> organizations have resorted to forking the repository and implementing their 
> own fixes.
>  * The project boasts an impressive user base with 167 contributors, 2.5k 
> likes, and endorsements from 45 organizations, as documented in the "Who is 
> using" document. Notably, there are many more organizations using it than the 
> initially reported 45.
>  * The primary issue at hand is that this project resides under the 
> GoogleCloudPlatform GitHub organization and is exclusively moderated by a 
> Google employee. Concerns have been raised by numerous users and customers 
> regarding the maintenance of the repository.
>  * The existing Google maintainers are constrained by limitations in terms of 
> time and support, which negatively impacts both the project and its user 
> community.
>  
> *Recent Developments:*
>  * During Kubecon Chicago 2023, AWS OSS Architects (Vara Bonthu) and the 
> Apple infrastructure team engaged in discussions with Google's team, 
> specifically with Marcin Wielgus. They expressed their interest in 
> contributing the project to either the Kubeflow or Apache Spark community.
>  * *{color:#00875a}Marcin from Google confirmed their willingness to donate 
> the project to either of these communities.{color}*
>  * An adoption process has been initiated by the Kubeflow project under CNCF, 
> as documented in the following thread: [Link to the 
> thread|https://github.com/kubeflow/community/issues/648].
>  
> *Primary Goal:*
>  * The primary goal is to ensure the endorsement of one tool, collaboratively 
> supported by the Apache Spark, Kubeflow, and CNCF communities.
>  
> *Next Steps:*
>  * *Meeting with Apache Spark Working Group Maintainers:* We propose 
> arranging a meeting with the Apache Spark working group maintainers to delve 
> deeper into this matter, address any questions or concerns they may have, and 
> collectively work towards a decision.
>  * *Establish a New Working Group:* Upon reaching an agreement, we intend to 
> create a new working group comprising members from diverse organizations who 
> are willing to contribute and collaborate on this initiative.
>  * *Repository Transfer:* Our plan involves transferring the project 
> repository from Google's organization to either the Apache or Kubeflow 
> organization, aligning with the chosen community.
>  * *Roadmap Development:* We will formulate a new roadmap that encompasses 
> immediate issue resolution and a long-term design strategy aimed at enhancing 
> performance, scalability, and security for this tool.
>  
> We ({*}Proposed Authors{*}) believe that endorsing Google's Spark K8s Operator 
> as the official Spark Operator will benefit the Apache Spark community and 
> address the current maintenance challenges. Your feedback and support in this 
> matter are highly valued.
> Let's collaborate to ensure a robust and well-maintained Spark Operator for 
> the Apache Spark community's benefit.
>  
> *Community members are encouraged to leave their comments or give a thumbs-up 
> to express their support for adopting Google's Spark Operator as the official 
> Apache Spark operator.*
>  
> *Proposed Authors*
> Vara Bonthu (AWS)
> Andrey Velichkevich (Apple)
> Chaoran Yu (Apple)
> Marcin Wielgus (Google)
> Rus Pandey (Apple)
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-46065) Refactor `(DataFrame|Series).factorize()` to use `create_map`.

2023-11-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-46065.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43970
[https://github.com/apache/spark/pull/43970]

> Refactor `(DataFrame|Series).factorize()` to use `create_map`.
> --
>
> Key: SPARK-46065
> URL: https://issues.apache.org/jira/browse/SPARK-46065
> Project: Spark
>  Issue Type: Bug
>  Components: Pandas API on Spark
>Affects Versions: 4.0.0
>Reporter: Haejoon Lee
>Assignee: Haejoon Lee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> We can accept a Column object for Column.__getitem__ on a remote Session, so we 
> can optimize the existing factorize implementation.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-46065) Refactor `(DataFrame|Series).factorize()` to use `create_map`.

2023-11-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-46065:


Assignee: Haejoon Lee

> Refactor `(DataFrame|Series).factorize()` to use `create_map`.
> --
>
> Key: SPARK-46065
> URL: https://issues.apache.org/jira/browse/SPARK-46065
> Project: Spark
>  Issue Type: Bug
>  Components: Pandas API on Spark
>Affects Versions: 4.0.0
>Reporter: Haejoon Lee
>Assignee: Haejoon Lee
>Priority: Major
>  Labels: pull-request-available
>
> We can accept a Column object for Column.__getitem__ on a remote Session, so we 
> can optimize the existing factorize implementation.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46072) Missing .jars when applying code to spark-connect

2023-11-23 Thread Dmitry Kravchuk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Kravchuk updated SPARK-46072:

Summary: Missing .jars when applying code to spark-connect  (was: Missing 
.jars when trying to apply code to spark-connect)

> Missing .jars when applying code to spark-connect
> -
>
> Key: SPARK-46072
> URL: https://issues.apache.org/jira/browse/SPARK-46072
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.4.1
> Environment: python 3.9
> scala 2.12
> spark 3.4.1
> hdfs 3.1.2
> hive 3.1.3
>Reporter: Dmitry Kravchuk
>Priority: Major
> Fix For: 3.4.2, 3.5.1
>
>
> I've built Spark with the following Maven command for our on-prem Hadoop cluster:
> {code:bash}
> ./build/mvn -Pyarn -Pkubernetes -Dhadoop.version=3.1.2 -Pscala-2.12 -Phive 
> -Phive-thriftserver -DskipTests clean package
> {code}
>  
> Then I start the Connect server like this:
> {code:bash}
> ./sbin/start-connect-server.sh --packages 
> org.apache.spark:spark-connect_2.12:3.4.1
> {code}
>  
> When I try to run any code after the following command, I always get an error 
> on the connect-server side:
> {code:bash}
> ./bin/pyspark --remote "sc://localhost"
> {code}
> Error: 
> {code:bash}
>           
> /home/zeppelin/.ivy2/local/org.apache.spark/spark-connect_2.12/3.4.1/jars/spark-connect_2.12.jar
>          central: tried
>           
> https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.4.1/spark-connect_2.12-3.4.1.pom
>           -- artifact 
> org.apache.spark#spark-connect_2.12;3.4.1!spark-connect_2.12.jar:
>           
> https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.4.1/spark-connect_2.12-3.4.1.jar
>          spark-packages: tried
>           
> https://repos.spark-packages.org/org/apache/spark/spark-connect_2.12/3.4.1/spark-connect_2.12-3.4.1.pom
>           -- artifact 
> org.apache.spark#spark-connect_2.12;3.4.1!spark-connect_2.12.jar:
>           
> https://repos.spark-packages.org/org/apache/spark/spark-connect_2.12/3.4.1/spark-connect_2.12-3.4.1.jar
>                 ::
>                 ::          UNRESOLVED DEPENDENCIES         ::
>                 ::
>                 :: org.apache.spark#spark-connect_2.12;3.4.1: not found
>                 ::
> {code}
>  
> Where am I going wrong? I thought it was a firewall issue, but it is not, because 
> I already set the http_proxy and https_proxy variables with my own credentials.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-46072) Missing .jars when trying to apply code to spark-connect

2023-11-23 Thread Dmitry Kravchuk (Jira)
Dmitry Kravchuk created SPARK-46072:
---

 Summary: Missing .jars when trying to apply code to spark-connect
 Key: SPARK-46072
 URL: https://issues.apache.org/jira/browse/SPARK-46072
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 3.4.1
 Environment: python 3.9

scala 2.12

spark 3.4.1

hdfs 3.1.2

hive 3.1.3
Reporter: Dmitry Kravchuk
 Fix For: 3.4.2, 3.5.1


I've built Spark with the following Maven command for our on-prem Hadoop cluster:
{code:bash}
./build/mvn -Pyarn -Pkubernetes -Dhadoop.version=3.1.2 -Pscala-2.12 -Phive 
-Phive-thriftserver -DskipTests clean package
{code}
 
Then I start the Connect server like this:
{code:bash}
./sbin/start-connect-server.sh --packages 
org.apache.spark:spark-connect_2.12:3.4.1
{code}
 
When I try to run any code after the following command, I always get an error 
on the connect-server side:
{code:bash}
./bin/pyspark --remote "sc://localhost"
{code}
Error: 
{code:bash}
          
/home/zeppelin/.ivy2/local/org.apache.spark/spark-connect_2.12/3.4.1/jars/spark-connect_2.12.jar

         central: tried

          
https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.4.1/spark-connect_2.12-3.4.1.pom

          -- artifact 
org.apache.spark#spark-connect_2.12;3.4.1!spark-connect_2.12.jar:

          
https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.4.1/spark-connect_2.12-3.4.1.jar

         spark-packages: tried

          
https://repos.spark-packages.org/org/apache/spark/spark-connect_2.12/3.4.1/spark-connect_2.12-3.4.1.pom

          -- artifact 
org.apache.spark#spark-connect_2.12;3.4.1!spark-connect_2.12.jar:

          
https://repos.spark-packages.org/org/apache/spark/spark-connect_2.12/3.4.1/spark-connect_2.12-3.4.1.jar

                ::

                ::          UNRESOLVED DEPENDENCIES         ::

                ::

                :: org.apache.spark#spark-connect_2.12;3.4.1: not found

                ::
{code}
 

Where am I going wrong? I thought it was a firewall issue, but it is not, because I 
already set the http_proxy and https_proxy variables with my own credentials.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-46064) EliminateEventTimeWatermark does not consider the fact that isStreaming flag can change for current child during resolution

2023-11-23 Thread Jungtaek Lim (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim resolved SPARK-46064.
--
Fix Version/s: 3.5.1
   4.0.0
   3.4.2
   Resolution: Fixed

Issue resolved by pull request 43971
[https://github.com/apache/spark/pull/43971]

> EliminateEventTimeWatermark does not consider the fact that isStreaming flag 
> can change for current child during resolution
> ---
>
> Key: SPARK-46064
> URL: https://issues.apache.org/jira/browse/SPARK-46064
> Project: Spark
>  Issue Type: Bug
>  Components: Structured Streaming
>Affects Versions: 3.3.2, 3.4.1, 3.5.0, 4.0.0
>Reporter: Jungtaek Lim
>Assignee: Jungtaek Lim
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.1, 4.0.0, 3.4.2
>
>
> Looks like this is a long-standing bug.
> The object `EliminateEventTimeWatermark` is implemented as a rule, but it is 
> not registered in the analyzer/optimizer. Instead, it is called directly when 
> the withWatermark method is called, which means the rule is applied immediately 
> against the child, regardless of whether the child is resolved or not.
> It is not an issue for pure DataFrame API usage, because streaming sources have 
> the isStreaming flag set to true even before they are resolved, but mixing SQL 
> and the DataFrame API would expose the issue; we may not know the exact value of 
> the isStreaming flag on an unresolved node, and it is subject to change upon 
> resolution.
> We should register EliminateEventTimeWatermark as a rule in analysis (or 
> pre-optimization) instead, and not apply the elimination if the child is not 
> yet resolved.
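
A minimal sketch of the SQL-plus-DataFrame mix the description refers to 
(assumptions: the rate source and column names are illustrative, and this is not 
claimed to be an exact reproduction of the bug): withWatermark is invoked on a 
child plan that originates from SQL, which is the path where the eagerly applied 
rule can see a node whose isStreaming flag is not final yet.
{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()

// Register a streaming source under a SQL-visible name (rate source is illustrative).
spark.readStream.format("rate").load().createOrReplaceTempView("events")

// Build the child plan via SQL, then attach the watermark through the DataFrame API;
// per the description, EliminateEventTimeWatermark runs eagerly at this call site
// rather than as a registered analysis rule.
val withWm = spark
  .sql("SELECT timestamp AS eventTime, value FROM events")
  .withWatermark("eventTime", "10 minutes")
{code}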



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-45311) Encoder fails on many "NoSuchElementException: None.get" since 3.4.x, search for an encoder for a generic type, and since 3.5.x isn't "an expression encoder"

2023-11-23 Thread Giambattista Bloisi (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-45311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789058#comment-17789058
 ] 

Giambattista Bloisi commented on SPARK-45311:
-

Could you disable failsafe trimStackTrace (as explained in 
[https://stackoverflow.com/questions/42248856/how-to-get-the-full-stacktrace-of-failed-tests-in-failsafe]
 ) to get the full stack trace of the error and report it here? 

> Encoder fails on many "NoSuchElementException: None.get" since 3.4.x, search 
> for an encoder for a generic type, and since 3.5.x isn't "an expression 
> encoder"
> -
>
> Key: SPARK-45311
> URL: https://issues.apache.org/jira/browse/SPARK-45311
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.4.0, 3.4.1, 3.5.0
> Environment: Debian 12
> Java 17
> Underlying Spring-Boot 2.7.14
>Reporter: Marc Le Bihan
>Priority: Major
>
> If you find it convenient, you might clone the 
> [https://gitlab.com/territoirevif/minimal-tests-spark-issue] project (that 
> does many operations around cities, local authorities and accounting with 
> open data) where I've extracted from my work what's necessary to make a set 
> of 35 tests that run correctly with Spark 3.3.x, and show the troubles 
> encountered with 3.4.x and 3.5.x.
>  
> It is working well with Spark 3.2.x and 3.3.x. But as soon as I select {*}Spark 
> 3.4.x{*}, where the encoder seems to have changed deeply, the encoder fails 
> with two problems:
>  
> *1)* It throws *java.util.NoSuchElementException: None.get* messages 
> everywhere.
> Asking around on the Internet, I found I wasn't alone in facing this problem. 
> Reading the linked question, you'll see that I attempted to debug it, but my 
> Scala skills are limited.
> [https://stackoverflow.com/questions/76036349/encoders-bean-doesnt-work-anymore-on-a-java-pojo-with-spark-3-4-0]
> {color:#172b4d}By the way, if possible, the encoder and decoder functions 
> should forward a parameter as soon as the name of the field being handled is 
> known, and keep forwarding it throughout their process, so that whenever the 
> encoder has to throw an exception, it knows which field it is handling in that 
> specific call and can send a message like:{color}
> {color:#00875a}_java.util.NoSuchElementException: None.get when encoding [the 
> method or field it was targeting]_{color}
>  
> *2)* *Not found an encoder of the type RS to Spark SQL internal 
> representation.* Consider to change the input type to one of supported at 
> (...)
> Or : Not found an encoder of the type *OMI_ID* to Spark SQL internal 
> representation (...)
>  
> where *RS* and *OMI_ID* are generic types.
> This is strange.
> [https://stackoverflow.com/questions/76045255/encoders-bean-attempts-to-check-the-validity-of-a-return-type-considering-its-ge]
>  
> *3)* When I switch to the *Spark 3.5.0* version, the same problems remain, 
> but another one adds itself to the list:
> "{*}Only expression encoders are supported for now{*}" on what was accepted 
> and working before.
>  
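
For readers trying this locally, a minimal sketch of the Encoders.bean pattern the 
report describes (assumptions: the bean shape and values are illustrative, not taken 
from the linked project, and this does not exercise the generic-type case mentioned 
in points 2 and 3):
{code:scala}
import org.apache.spark.sql.{Encoders, SparkSession}
import scala.beans.BeanProperty

// Illustrative Java-style bean; the linked project uses real domain POJOs.
class CityRecord extends Serializable {
  @BeanProperty var name: String = _
  @BeanProperty var inseeCode: String = _
}

val spark = SparkSession.builder().master("local[*]").getOrCreate()

val city = new CityRecord
city.setName("Lyon")
city.setInseeCode("69123")

// Bean encoders are resolved reflectively; this is the resolution the report says
// started throwing NoSuchElementException: None.get on 3.4.x.
val ds = spark.createDataset(java.util.Arrays.asList(city))(Encoders.bean(classOf[CityRecord]))
ds.show()
{code}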



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46070) Precompile regex patterns in SparkDateTimeUtils.getZoneId

2023-11-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-46070:
---
Labels: pull-request-available  (was: )

> Precompile regex patterns in SparkDateTimeUtils.getZoneId
> -
>
> Key: SPARK-46070
> URL: https://issues.apache.org/jira/browse/SPARK-46070
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.5.0
>Reporter: Tanel Kiis
>Priority: Major
>  Labels: pull-request-available
>
> SparkDateTimeUtils.getZoneId uses the String.replaceFirst method, which 
> internally does a Pattern.compile(regex). This method is called once per 
> dataset row when using functions like from_utc_timestamp.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-46071) TreeNode.toJSON may result in OOM when there are multiple levels of nesting of expressions.

2023-11-23 Thread JacobZheng (Jira)
JacobZheng created SPARK-46071:
--

 Summary: TreeNode.toJSON may result in OOM when there are multiple 
levels of nesting of expressions.
 Key: SPARK-46071
 URL: https://issues.apache.org/jira/browse/SPARK-46071
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.4.0
Reporter: JacobZheng


I am encountering an OOM exception when executing the following code:
{code:scala}
parser.parseExpression(sql).toJSON
{code}
This SQL is a deeply nested {*}_CaseWhen_{*}. After testing, I found that the 
number of expressions in the JSON increases exponentially as the nesting depth 
increases.

Here are some examples:

sql:
{code:sql}
CASE WHEN(`cost` <= 275) THEN '(270-275]' 
ELSE '' END
{code}
json:
{code:json}
[
{
"class":"org.apache.spark.sql.catalyst.expressions.CaseWhen",
"num-children":3,
"branches":[
{
"product-class":"scala.Tuple2",
"_1":[
{

"class":"org.apache.spark.sql.catalyst.expressions.LessThanOrEqual",
"num-children":2,
"left":0,
"right":1
},
{

"class":"org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute",
"num-children":0,
"nameParts":"[cost]"
},
{

"class":"org.apache.spark.sql.catalyst.expressions.Literal",
"num-children":0,
"value":"275",
"dataType":"integer"
}
],
"_2":[
{

"class":"org.apache.spark.sql.catalyst.expressions.Literal",
"num-children":0,
"value":"(270-275]",
"dataType":"string"
}
]
}
],
"elseValue":[
{
"class":"org.apache.spark.sql.catalyst.expressions.Literal",
"num-children":0,
"value":"",
"dataType":"string"
}
]
},
{
"class":"org.apache.spark.sql.catalyst.expressions.LessThanOrEqual",
"num-children":2,
"left":0,
"right":1
},
{
"class":"org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute",
"num-children":0,
"nameParts":"[cost]"
},
{
"class":"org.apache.spark.sql.catalyst.expressions.Literal",
"num-children":0,
"value":"275",
"dataType":"integer"
},
{
"class":"org.apache.spark.sql.catalyst.expressions.Literal",
"num-children":0,
"value":"(270-275]",
"dataType":"string"
},
{
"class":"org.apache.spark.sql.catalyst.expressions.Literal",
"num-children":0,
"value":"",
"dataType":"string"
}
]
{code}
The child nodes of the *_CaseWhen_* expression are stored twice in JSON.

When *_CaseWhen_* is nested twice, the child expression of the first case when 
is repeated 4 times, and so on.
{code:sql}
CASE WHEN(`cost` <= 270) THEN '(265-270]'
ELSE 
CASE WHEN(`cost` <= 275) THEN '(270-275]' 
ELSE '' END END
{code}
Nesting the *_CaseWhen_* expression n times in this case will result in 2^n+11 
expressions in the json.

The reason for this problem is that the fields of *_CaseWhen_* cannot be 
converted to child-node indexes when the {*}_jsonFields_{*} method is executed.

Perhaps simplifying the *_CaseWhen_* JSON a bit by overriding the *_jsonFields_* 
method is a viable way to go.
{code:json}
[
{
"class":"org.apache.spark.sql.catalyst.expressions.CaseWhen",
"num-children":3,
"branches":[
{
"condition":0,
"value":1
}
],
"elseValue":2
},
{
"class":"org.apache.spark.sql.catalyst.expressions.LessThanOrEqual",
"num-children":2,
"left":0,
"right":1
},
{
"class":"org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute",
"num-children":0,
"nameParts":"[cost]"
},
{
"class":"org.apache.spark.sql.catalyst.expressions.Literal",
"num-children":0,
"value":"275",
"dataType":"integer"
},
{
"class":"org.apache.spark.sql.catalyst.expressions.Literal",
"num-children":0,
"value":"(270-275]",
"dataType":"string"
},
{
"class":"org.apache.spark.sql.catalyst.expressions.Literal",
"num-children":0,
"value":"",
"dataType":"string"
}
]
{code}
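
Below is a small sketch, runnable in spark-shell, that illustrates the growth 
described above by parsing progressively deeper nested CASE WHEN expressions and 
counting the serialized nodes (assumption: the thresholds and labels are 
illustrative, not taken from the original query):
{code:scala}
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser

// Build an n-level nested CASE WHEN string (thresholds and labels are illustrative).
def nestedCaseWhen(depth: Int): String =
  if (depth == 0) "''"
  else s"CASE WHEN(`cost` <= ${270 + 5 * depth}) THEN 'bucket-$depth' ELSE ${nestedCaseWhen(depth - 1)} END"

// Count serialized nodes to observe the exponential growth described above.
(1 to 8).foreach { n =>
  val json = CatalystSqlParser.parseExpression(nestedCaseWhen(n)).toJSON
  val nodes = "\"class\"".r.findAllIn(json).size
  println(s"depth=$n serialized nodes=$nodes")
}
{code}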
 



--
This message was sent by Atlassian Jira

[jira] [Created] (SPARK-46070) Precompile regex patterns in SparkDateTimeUtils.getZoneId

2023-11-23 Thread Tanel Kiis (Jira)
Tanel Kiis created SPARK-46070:
--

 Summary: Precompile regex patterns in SparkDateTimeUtils.getZoneId
 Key: SPARK-46070
 URL: https://issues.apache.org/jira/browse/SPARK-46070
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.5.0
Reporter: Tanel Kiis


SparkDateTimeUtils.getZoneId uses the String.replaceFirst method, which internally 
does a Pattern.compile(regex). This method is called once per dataset row when 
using functions like from_utc_timestamp.
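
A minimal sketch of the hoisting this issue suggests (assumption: the pattern below 
is illustrative, not the exact regex used by getZoneId): a precompiled 
java.util.regex.Pattern is reused across rows instead of recompiling the regex on 
every String.replaceFirst call.
{code:scala}
import java.util.regex.Pattern

object ZoneIdPatterns {
  // Assumption: the pattern is illustrative, not the exact one in getZoneId.
  private val signPrefix: Pattern = Pattern.compile("^([+-])")

  // Behaves like zoneId.replaceFirst("^([+-])", "GMT$1"), but the Pattern is
  // compiled once and reused for every row instead of being recompiled per call.
  def normalize(zoneId: String): String =
    signPrefix.matcher(zoneId).replaceFirst("GMT$1")
}
{code}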



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-46069) Support unwrap timestamp type to date type

2023-11-23 Thread Wan Kun (Jira)
Wan Kun created SPARK-46069:
---

 Summary: Support unwrap timestamp type to date type
 Key: SPARK-46069
 URL: https://issues.apache.org/jira/browse/SPARK-46069
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 4.0.0
Reporter: Wan Kun






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org