[jira] [Updated] (SPARK-46119) Override toString method for UnresolvedAlias
[ https://issues.apache.org/jira/browse/SPARK-46119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-46119:
Labels: pull-request-available (was: )

> Override toString method for UnresolvedAlias
>
> Key: SPARK-46119
> URL: https://issues.apache.org/jira/browse/SPARK-46119
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Yuming Wang
> Priority: Major
> Labels: pull-request-available

-- This message was sent by Atlassian Jira (v8.20.10#820010)
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-46120) Remove helper function DataFrame.withPlan
[ https://issues.apache.org/jira/browse/SPARK-46120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-46120:
Labels: pull-request-available (was: )

> Remove helper function DataFrame.withPlan
>
> Key: SPARK-46120
> URL: https://issues.apache.org/jira/browse/SPARK-46120
> Project: Spark
> Issue Type: Improvement
> Components: Connect, PySpark
> Affects Versions: 4.0.0
> Reporter: Ruifeng Zheng
> Priority: Major
> Labels: pull-request-available
[jira] [Created] (SPARK-46120) Remove helper function DataFrame.withPlan
Ruifeng Zheng created SPARK-46120:
Summary: Remove helper function DataFrame.withPlan
Key: SPARK-46120
URL: https://issues.apache.org/jira/browse/SPARK-46120
Project: Spark
Issue Type: Bug
Components: Connect, PySpark
Affects Versions: 4.0.0
Reporter: Ruifeng Zheng
[jira] [Created] (SPARK-46119) Override toString method for UnresolvedAlias
Yuming Wang created SPARK-46119:
Summary: Override toString method for UnresolvedAlias
Key: SPARK-46119
URL: https://issues.apache.org/jira/browse/SPARK-46119
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 4.0.0
Reporter: Yuming Wang
[jira] [Updated] (SPARK-46117) Enhancing readability of PySpark API reference by hiding verbose typehints.
[ https://issues.apache.org/jira/browse/SPARK-46117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-46117:
Labels: pull-request-available (was: )

> Enhancing readability of PySpark API reference by hiding verbose typehints.
>
> Key: SPARK-46117
> URL: https://issues.apache.org/jira/browse/SPARK-46117
> Project: Spark
> Issue Type: Bug
> Components: Documentation, PySpark
> Affects Versions: 4.0.0
> Reporter: Haejoon Lee
> Priority: Major
> Labels: pull-request-available
>
> Currently, the PySpark API documentation displays all type hints in the signatures, which can make the documentation appear cluttered and less readable. By setting `autodoc_typehints` to 'none', we can achieve a cleaner and more concise presentation of our API, similar to how the Pandas documentation handles type hints. This approach has been effective in Pandas, making the documentation more approachable and easier to understand, especially for newcomers.
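The `autodoc_typehints` switch the ticket describes is a standard Sphinx autodoc option. A minimal sketch of the change, assuming a generic `conf.py` (the surrounding values are illustrative, not Spark's actual documentation config):

```python
# docs/source/conf.py -- illustrative fragment only
extensions = [
    "sphinx.ext.autodoc",  # pulls API signatures and docstrings from the code
]

# Hide type hints throughout the rendered API reference, so a signature like
#   sample(self, fraction: Optional[float] = None, seed: Optional[int] = None)
# renders as the much shorter
#   sample(self, fraction=None, seed=None)
autodoc_typehints = "none"
```

Other accepted values are "signature" (the Sphinx default) and "description", which moves the hints below the signature instead of removing them.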
[jira] [Created] (SPARK-46118) Use `SparkSession.sessionState.conf` instead of `sqlContext.conf`
Yang Jie created SPARK-46118:
Summary: Use `SparkSession.sessionState.conf` instead of `sqlContext.conf`
Key: SPARK-46118
URL: https://issues.apache.org/jira/browse/SPARK-46118
Project: Spark
Issue Type: Improvement
Components: Connect, SQL
Affects Versions: 4.0.0
Reporter: Yang Jie
[jira] [Updated] (SPARK-46118) Use `SparkSession.sessionState.conf` instead of `sqlContext.conf`
[ https://issues.apache.org/jira/browse/SPARK-46118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-46118:
Labels: pull-request-available (was: )

> Use `SparkSession.sessionState.conf` instead of `sqlContext.conf`
>
> Key: SPARK-46118
> URL: https://issues.apache.org/jira/browse/SPARK-46118
> Project: Spark
> Issue Type: Improvement
> Components: Connect, SQL
> Affects Versions: 4.0.0
> Reporter: Yang Jie
> Priority: Major
> Labels: pull-request-available
[jira] [Created] (SPARK-46117) Enhancing readability of PySpark API reference by hiding verbose typehints.
Haejoon Lee created SPARK-46117:
Summary: Enhancing readability of PySpark API reference by hiding verbose typehints.
Key: SPARK-46117
URL: https://issues.apache.org/jira/browse/SPARK-46117
Project: Spark
Issue Type: Bug
Components: Documentation, PySpark
Affects Versions: 4.0.0
Reporter: Haejoon Lee

Currently, the PySpark API documentation displays all type hints in the signatures, which can make the documentation appear cluttered and less readable. By setting `autodoc_typehints` to 'none', we can achieve a cleaner and more concise presentation of our API, similar to how the Pandas documentation handles type hints. This approach has been effective in Pandas, making the documentation more approachable and easier to understand, especially for newcomers.
[jira] [Assigned] (SPARK-46114) Define IndexError for PySpark error framework
[ https://issues.apache.org/jira/browse/SPARK-46114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon reassigned SPARK-46114:
Assignee: Hyukjin Kwon

> Define IndexError for PySpark error framework
>
> Key: SPARK-46114
> URL: https://issues.apache.org/jira/browse/SPARK-46114
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 4.0.0
> Reporter: Hyukjin Kwon
> Assignee: Hyukjin Kwon
> Priority: Major
> Labels: pull-request-available
[jira] [Resolved] (SPARK-46114) Define IndexError for PySpark error framework
[ https://issues.apache.org/jira/browse/SPARK-46114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-46114.
Fix Version/s: 4.0.0
Resolution: Fixed

Issue resolved by pull request 44028
[https://github.com/apache/spark/pull/44028]

> Define IndexError for PySpark error framework
>
> Key: SPARK-46114
> URL: https://issues.apache.org/jira/browse/SPARK-46114
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 4.0.0
> Reporter: Hyukjin Kwon
> Assignee: Hyukjin Kwon
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
[jira] [Updated] (SPARK-46116) Adding "Q&A Support" and "Mailing Lists" link into PySpark doc homepage.
[ https://issues.apache.org/jira/browse/SPARK-46116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haejoon Lee updated SPARK-46116:
Description:
It is aimed at improving user engagement and providing quick access to community support and discussions. This approach is inspired by the [Pandas documentation](https://pandas.pydata.org/docs/index.html), which effectively uses a similar section for community engagement. The "Q&A Support" link will lead users to a curated list of StackOverflow questions tagged with `pyspark`, while the mailing lists will offer platforms for deeper discussions and insights within the Spark community.

was: The addition of the "Q&A Support" link provides quick access to the community-driven Q&A platform, StackOverflow, where users can seek help and contribute to discussions about PySpark. It enhances the user experience by connecting the documentation with a dynamic and interactive community resource.

> Adding "Q&A Support" and "Mailing Lists" link into PySpark doc homepage.
>
> Key: SPARK-46116
> URL: https://issues.apache.org/jira/browse/SPARK-46116
> Project: Spark
> Issue Type: Bug
> Components: Documentation, PySpark
> Affects Versions: 4.0.0
> Reporter: Haejoon Lee
> Priority: Major
> Labels: pull-request-available
>
> It is aimed at improving user engagement and providing quick access to community support and discussions. This approach is inspired by the [Pandas documentation](https://pandas.pydata.org/docs/index.html), which effectively uses a similar section for community engagement.
> The "Q&A Support" link will lead users to a curated list of StackOverflow questions tagged with `pyspark`, while the mailing lists will offer platforms for deeper discussions and insights within the Spark community.
[jira] [Updated] (SPARK-46116) Adding "Q&A Support" and "Mailing Lists" link into PySpark doc homepage.
[ https://issues.apache.org/jira/browse/SPARK-46116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haejoon Lee updated SPARK-46116:
Summary: Adding "Q&A Support" and "Mailing Lists" link into PySpark doc homepage. (was: Enriching PySpark doc with "Useful links" including Q&A Support and Mailing Lists)

> Adding "Q&A Support" and "Mailing Lists" link into PySpark doc homepage.
>
> Key: SPARK-46116
> URL: https://issues.apache.org/jira/browse/SPARK-46116
> Project: Spark
> Issue Type: Bug
> Components: Documentation, PySpark
> Affects Versions: 4.0.0
> Reporter: Haejoon Lee
> Priority: Major
>
> The addition of the "Q&A Support" link provides quick access to the community-driven Q&A platform, StackOverflow, where users can seek help and contribute to discussions about PySpark. It enhances the user experience by connecting the documentation with a dynamic and interactive community resource.
[jira] [Updated] (SPARK-46116) Adding "Q&A Support" and "Mailing Lists" link into PySpark doc homepage.
[ https://issues.apache.org/jira/browse/SPARK-46116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-46116:
Labels: pull-request-available (was: )

> Adding "Q&A Support" and "Mailing Lists" link into PySpark doc homepage.
>
> Key: SPARK-46116
> URL: https://issues.apache.org/jira/browse/SPARK-46116
> Project: Spark
> Issue Type: Bug
> Components: Documentation, PySpark
> Affects Versions: 4.0.0
> Reporter: Haejoon Lee
> Priority: Major
> Labels: pull-request-available
>
> The addition of the "Q&A Support" link provides quick access to the community-driven Q&A platform, StackOverflow, where users can seek help and contribute to discussions about PySpark. It enhances the user experience by connecting the documentation with a dynamic and interactive community resource.
[jira] [Updated] (SPARK-46116) Enriching PySpark doc with "Useful links" including Q&A Support and Mailing Lists
[ https://issues.apache.org/jira/browse/SPARK-46116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haejoon Lee updated SPARK-46116:
Summary: Enriching PySpark doc with "Useful links" including Q&A Support and Mailing Lists (was: Enriching PySpark doc with "Useful links" Including Q&A Support and Mailing Lists)

> Enriching PySpark doc with "Useful links" including Q&A Support and Mailing Lists
>
> Key: SPARK-46116
> URL: https://issues.apache.org/jira/browse/SPARK-46116
> Project: Spark
> Issue Type: Bug
> Components: Documentation, PySpark
> Affects Versions: 4.0.0
> Reporter: Haejoon Lee
> Priority: Major
>
> The addition of the "Q&A Support" link provides quick access to the community-driven Q&A platform, StackOverflow, where users can seek help and contribute to discussions about PySpark. It enhances the user experience by connecting the documentation with a dynamic and interactive community resource.
[jira] [Updated] (SPARK-46116) Enriching "Useful links" on PySpark docs including "Q&A Support" and "Mailing Lists"
[ https://issues.apache.org/jira/browse/SPARK-46116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haejoon Lee updated SPARK-46116:
Summary: Enriching "Useful links" on PySpark docs including "Q&A Support" and "Mailing Lists" (was: Enriching PySpark Documentation with "Useful Links" Including Q&A Support and Mailing Lists)

> Enriching "Useful links" on PySpark docs including "Q&A Support" and "Mailing Lists"
>
> Key: SPARK-46116
> URL: https://issues.apache.org/jira/browse/SPARK-46116
> Project: Spark
> Issue Type: Bug
> Components: Documentation, PySpark
> Affects Versions: 4.0.0
> Reporter: Haejoon Lee
> Priority: Major
>
> The addition of the "Q&A Support" link provides quick access to the community-driven Q&A platform, StackOverflow, where users can seek help and contribute to discussions about PySpark. It enhances the user experience by connecting the documentation with a dynamic and interactive community resource.
[jira] [Updated] (SPARK-46116) Enriching PySpark doc with "Useful links" Including Q&A Support and Mailing Lists
[ https://issues.apache.org/jira/browse/SPARK-46116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haejoon Lee updated SPARK-46116:
Summary: Enriching PySpark doc with "Useful links" Including Q&A Support and Mailing Lists (was: Enriching "Useful links" on PySpark docs including "Q&A Support" and "Mailing Lists")

> Enriching PySpark doc with "Useful links" Including Q&A Support and Mailing Lists
>
> Key: SPARK-46116
> URL: https://issues.apache.org/jira/browse/SPARK-46116
> Project: Spark
> Issue Type: Bug
> Components: Documentation, PySpark
> Affects Versions: 4.0.0
> Reporter: Haejoon Lee
> Priority: Major
>
> The addition of the "Q&A Support" link provides quick access to the community-driven Q&A platform, StackOverflow, where users can seek help and contribute to discussions about PySpark. It enhances the user experience by connecting the documentation with a dynamic and interactive community resource.
[jira] [Updated] (SPARK-46116) Enriching PySpark Documentation with "Useful Links" Including Q&A Support and Mailing Lists
[ https://issues.apache.org/jira/browse/SPARK-46116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haejoon Lee updated SPARK-46116:
Summary: Enriching PySpark Documentation with "Useful Links" Including Q&A Support and Mailing Lists (was: Add "Q&A Support" Link to PySpark Documentation Homepage)

> Enriching PySpark Documentation with "Useful Links" Including Q&A Support and Mailing Lists
>
> Key: SPARK-46116
> URL: https://issues.apache.org/jira/browse/SPARK-46116
> Project: Spark
> Issue Type: Bug
> Components: Documentation, PySpark
> Affects Versions: 4.0.0
> Reporter: Haejoon Lee
> Priority: Major
>
> The addition of the "Q&A Support" link provides quick access to the community-driven Q&A platform, StackOverflow, where users can seek help and contribute to discussions about PySpark. It enhances the user experience by connecting the documentation with a dynamic and interactive community resource.
[jira] [Created] (SPARK-46116) Add "Q&A Support" Link to PySpark Documentation Homepage
Haejoon Lee created SPARK-46116:
Summary: Add "Q&A Support" Link to PySpark Documentation Homepage
Key: SPARK-46116
URL: https://issues.apache.org/jira/browse/SPARK-46116
Project: Spark
Issue Type: Bug
Components: Documentation, PySpark
Affects Versions: 4.0.0
Reporter: Haejoon Lee

The addition of the "Q&A Support" link provides quick access to the community-driven Q&A platform, StackOverflow, where users can seek help and contribute to discussions about PySpark. It enhances the user experience by connecting the documentation with a dynamic and interactive community resource.
[jira] [Created] (SPARK-46115) Restrict charsets in encode()
Max Gekk created SPARK-46115:
Summary: Restrict charsets in encode()
Key: SPARK-46115
URL: https://issues.apache.org/jira/browse/SPARK-46115
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 4.0.0
Reporter: Max Gekk
Assignee: Max Gekk

Currently, the list of charsets supported by encode() is not stable: it depends entirely on the JDK version in use, so user code can break simply because an operator changed the Java version on the Spark cluster. This ticket aims to restrict the list of supported charsets to:
{code}
'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16'
{code}
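The proposed restriction can be sketched in a few lines. This is an illustrative Python model of the whitelist behaviour, not the actual Spark implementation; the function name `check_encode_charset` is made up:

```python
# The fixed, JDK-independent whitelist proposed in the ticket.
SUPPORTED_CHARSETS = {"US-ASCII", "ISO-8859-1", "UTF-8", "UTF-16BE", "UTF-16LE", "UTF-16"}

def check_encode_charset(charset: str) -> str:
    """Accept a charset only if it is on the fixed whitelist, regardless of
    which encodings the underlying runtime happens to support."""
    normalized = charset.upper()
    if normalized not in SUPPORTED_CHARSETS:
        raise ValueError(f"invalid parameter `charset`: {charset!r}")
    return normalized

"Spark".encode(check_encode_charset("utf-8"))  # -> b'Spark'
# check_encode_charset("CP1252") raises, even though the runtime knows CP1252
```

Pinning the set this way trades breadth for stability: a query that works on one cluster keeps working after a Java upgrade, because the accepted charsets no longer track the JDK.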
[jira] [Commented] (SPARK-46105) df.emptyDataFrame shows 1 if we repartition(1) in Spark 3.3.x and above
[ https://issues.apache.org/jira/browse/SPARK-46105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17789908#comment-17789908 ]

XiDuo You commented on SPARK-46105:
Please see SPARK-39915

> df.emptyDataFrame shows 1 if we repartition(1) in Spark 3.3.x and above
>
> Key: SPARK-46105
> URL: https://issues.apache.org/jira/browse/SPARK-46105
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.3.3
> Environment: EKS, EMR
> Reporter: dharani_sugumar
> Priority: Major
> Attachments: Screenshot 2023-11-26 at 11.54.58 AM.png
>
> Version: 3.3.3
> scala> val df = spark.emptyDataFrame
> df: org.apache.spark.sql.DataFrame = []
> scala> df.rdd.getNumPartitions
> res0: Int = 0
> scala> df.repartition(1).rdd.getNumPartitions
> res1: Int = 1
> scala> df.repartition(1).rdd.isEmpty()
> [Stage 1:> (0 + 1) /
> res2: Boolean = true
>
> Version: 3.2.4
> scala> val df = spark.emptyDataFrame
> df: org.apache.spark.sql.DataFrame = []
> scala> df.rdd.getNumPartitions
> res0: Int = 0
> scala> df.repartition(1).rdd.getNumPartitions
> res1: Int = 0
> scala> df.repartition(1).rdd.isEmpty()
> res2: Boolean = true
>
> Version: 3.5.0
> scala> val df = spark.emptyDataFrame
> df: org.apache.spark.sql.DataFrame = []
> scala> df.rdd.getNumPartitions
> res0: Int = 0
> scala> df.repartition(1).rdd.getNumPartitions
> res1: Int = 1
> scala> df.repartition(1).rdd.isEmpty()
> [Stage 1:> (0 + 1) /
> res2: Boolean = true
>
> When we do repartition(1) on an empty dataframe, the resulting number of partitions is 1 in versions 3.3.x and 3.5.x, whereas when I do the same in version 3.2.x, it is 0. May I know why this behaviour changed from 3.2.x to higher versions?
>
> The reason for raising this as a bug is that I have a scenario where my final dataframe returns 0 records on EKS (local Spark) with a single node (driver and executor on the same node) but returns 1 on EMR, both using the same Spark version 3.3.3. I'm not sure why this behaves differently in the two environments. As an interim solution, I had to repartition an empty dataframe if my final dataframe is empty, which returns 1 on 3.3.3. I would like to know whether this is really a bug, or whether this behaviour persists in future versions and cannot be changed.
>
> Because if we go for a Spark upgrade and this behaviour changes, we will face the issue again.
> Please confirm.
[jira] [Commented] (SPARK-45311) Encoder fails on many "NoSuchElementException: None.get" since 3.4.x, search for an encoder for a generic type, and since 3.5.x isn't "an expression encoder"
[ https://issues.apache.org/jira/browse/SPARK-45311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17789905#comment-17789905 ]

Marc Le Bihan commented on SPARK-45311:
Thanks. I only had to change the getter
{{public Ressources getRessources()}}
to
{{public Map<RessourceJeuDeDonneesId, Ressource> getRessources()}}
to make it work, and change a test from
{{jeuDeDonnees.getRessources().forEach((ressource) -> LOGGER.info(...))}}
to
{{jeuDeDonnees.getRessources().forEach((id, ressource) -> LOGGER.info(...))}}
and now all the troubles I had in this issue are solved or have a workaround.

> Encoder fails on many "NoSuchElementException: None.get" since 3.4.x, search for an encoder for a generic type, and since 3.5.x isn't "an expression encoder"
>
> Key: SPARK-45311
> URL: https://issues.apache.org/jira/browse/SPARK-45311
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.4.0, 3.4.1, 3.5.0
> Environment: Debian 12, Java 17, underlying Spring-Boot 2.7.14
> Reporter: Marc Le Bihan
> Priority: Major
> Attachments: JavaTypeInference_116.png, sparkIssue_02.png
>
> If you find it convenient, you might clone the [https://gitlab.com/territoirevif/minimal-tests-spark-issue] project (that does many operations around cities, local authorities and accounting with open data), where I've extracted from my work what's necessary to make a set of 35 tests that run correctly with Spark 3.3.x, and show the troubles encountered with 3.4.x and 3.5.x.
>
> It works well with Spark 3.2.x and 3.3.x. But as soon as I select *Spark 3.4.x*, where the encoder seems to have changed deeply, the encoder fails with two problems:
>
> *1)* It throws *java.util.NoSuchElementException: None.get* messages everywhere.
> Asking over the Internet, I found I wasn't alone facing this problem. Reading the thread, you'll see that I've attempted a debug, but my Scala skills are low.
> [https://stackoverflow.com/questions/76036349/encoders-bean-doesnt-work-anymore-on-a-java-pojo-with-spark-3-4-0]
> By the way, if possible, the encoder and decoder functions should forward a parameter as soon as the name of the field being handled is known, and keep it along the whole process, so that wherever the encoder has to throw an exception, it knows which field it is handling in that specific call and can emit a message like:
> _java.util.NoSuchElementException: None.get when encoding [the method or field it was targeting]_
>
> *2)* *Not found an encoder of the type RS to Spark SQL internal representation.* Consider to change the input type to one of supported at (...)
> Or: Not found an encoder of the type *OMI_ID* to Spark SQL internal representation (...)
> where *RS* and *OMI_ID* are generic types. This is strange.
> [https://stackoverflow.com/questions/76045255/encoders-bean-attempts-to-check-the-validity-of-a-return-type-considering-its-ge]
>
> *3)* When I switch to the *Spark 3.5.0* version, the same problems remain, but another adds itself to the list:
> "{*}Only expression encoders are supported for now{*}" on what was accepted and working before.
[jira] [Updated] (SPARK-46114) Define IndexError for PySpark error framework
[ https://issues.apache.org/jira/browse/SPARK-46114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-46114:
Labels: pull-request-available (was: )

> Define IndexError for PySpark error framework
>
> Key: SPARK-46114
> URL: https://issues.apache.org/jira/browse/SPARK-46114
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 4.0.0
> Reporter: Hyukjin Kwon
> Priority: Major
> Labels: pull-request-available
[jira] [Created] (SPARK-46114) Define IndexError for PySpark error framework
Hyukjin Kwon created SPARK-46114:
Summary: Define IndexError for PySpark error framework
Key: SPARK-46114
URL: https://issues.apache.org/jira/browse/SPARK-46114
Project: Spark
Issue Type: Sub-task
Components: PySpark
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon
[jira] [Created] (SPARK-46112) Enforce usage of PySpark-specific Exceptions over built-in Python Exceptions
Haejoon Lee created SPARK-46112:
Summary: Enforce usage of PySpark-specific Exceptions over built-in Python Exceptions
Key: SPARK-46112
URL: https://issues.apache.org/jira/browse/SPARK-46112
Project: Spark
Issue Type: Sub-task
Components: Build, PySpark
Affects Versions: 4.0.0
Reporter: Haejoon Lee

Currently, in the PySpark codebase, there is an inconsistency in the usage of exceptions. In some instances, PySpark-specific exceptions are utilized, while in others, generic Python built-in exceptions are used. This inconsistency can lead to confusion and difficulty in maintaining and debugging the code. See [https://github.com/apache/spark/pull/44024] for related work fixing such a case.

The goal of this ticket is to establish a standardized practice for error handling in PySpark by mandating the use of PySpark-specific exceptions where applicable. This will ensure that all exceptions thrown within PySpark adhere to a consistent format and standard, making them more informative and easier to handle.
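The pattern this ticket mandates can be sketched as follows. This is a hedged illustration of the error-class style, with made-up class and template names, not the real `pyspark.errors` implementation:

```python
# Illustrative error-class registry; a real framework would keep these
# templates in a central, reviewable location rather than inline strings.
ERROR_CLASSES = {
    "INDEX_NOT_POSITIVE": "Index must be positive, got '{index}'.",
}

class IllustrativePySparkValueError(ValueError):
    """A ValueError that also carries a machine-readable error class and
    the parameters used to render its message."""

    def __init__(self, error_class: str, message_parameters: dict):
        self.error_class = error_class
        self.message_parameters = message_parameters
        super().__init__(ERROR_CLASSES[error_class].format(**message_parameters))

# Instead of `raise ValueError("Index must be positive")`:
err = IllustrativePySparkValueError("INDEX_NOT_POSITIVE", {"index": -1})
```

Because the class still subclasses the built-in, existing `except ValueError:` handlers keep working, while tooling and tests can match on the stable `error_class` instead of parsing free-form message text.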
[jira] [Updated] (SPARK-46111) Add copyright to the PySpark official documentation.
[ https://issues.apache.org/jira/browse/SPARK-46111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-46111:
Labels: pull-request-available (was: )

> Add copyright to the PySpark official documentation.
>
> Key: SPARK-46111
> URL: https://issues.apache.org/jira/browse/SPARK-46111
> Project: Spark
> Issue Type: Bug
> Components: Documentation
> Affects Versions: 4.0.0
> Reporter: Haejoon Lee
> Priority: Major
> Labels: pull-request-available
>
> Add copyright to the PySpark official documentation by using a Sphinx extension.
[jira] [Created] (SPARK-46111) Add copyright to the PySpark official documentation.
Haejoon Lee created SPARK-46111:
Summary: Add copyright to the PySpark official documentation.
Key: SPARK-46111
URL: https://issues.apache.org/jira/browse/SPARK-46111
Project: Spark
Issue Type: Bug
Components: Documentation
Affects Versions: 4.0.0
Reporter: Haejoon Lee

Add copyright to the PySpark official documentation by using a Sphinx extension.
[jira] [Assigned] (SPARK-45699) Fix "Widening conversion from `TypeA` to `TypeB` is deprecated because it loses precision"
[ https://issues.apache.org/jira/browse/SPARK-45699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Jie reassigned SPARK-45699:
Assignee: Hannah Amundson

> Fix "Widening conversion from `TypeA` to `TypeB` is deprecated because it loses precision"
>
> Key: SPARK-45699
> URL: https://issues.apache.org/jira/browse/SPARK-45699
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 4.0.0
> Reporter: Yang Jie
> Assignee: Hannah Amundson
> Priority: Major
> Labels: pull-request-available
>
> {code:java}
> [error] /Users/yangjie01/SourceCode/git/spark-mine-sbt/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala:1199:67: Widening conversion from Long to Double is deprecated because it loses precision. Write `.toDouble` instead. [quickfixable]
> [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=deprecation, site=org.apache.spark.scheduler.TaskSetManager.checkSpeculatableTasks.threshold
> [error]   val threshold = max(speculationMultiplier * medianDuration, minTimeToSpeculation)
> [error] /Users/yangjie01/SourceCode/git/spark-mine-sbt/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala:1207:60: Widening conversion from Long to Double is deprecated because it loses precision. Write `.toDouble` instead. [quickfixable]
> [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=deprecation, site=org.apache.spark.scheduler.TaskSetManager.checkSpeculatableTasks
> [error]   foundTasks = checkAndSubmitSpeculatableTasks(timeMs, threshold, customizedThreshold = true)
> [error] /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow/ArrowVectorReader.scala:137:48: Widening conversion from Int to Float is deprecated because it loses precision. Write `.toFloat` instead. [quickfixable]
> [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=deprecation, site=org.apache.spark.sql.connect.client.arrow.IntVectorReader.getFloat
> [error]   override def getFloat(i: Int): Float = getInt(i)
> [error] /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow/ArrowVectorReader.scala:146:49: Widening conversion from Long to Float is deprecated because it loses precision. Write `.toFloat` instead. [quickfixable]
> [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=deprecation, site=org.apache.spark.sql.connect.client.arrow.BigIntVectorReader.getFloat
> [error]   override def getFloat(i: Int): Float = getLong(i)
> [error] /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow/ArrowVectorReader.scala:147:51: Widening conversion from Long to Double is deprecated because it loses precision. Write `.toDouble` instead. [quickfixable]
> [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=deprecation, site=org.apache.spark.sql.connect.client.arrow.BigIntVectorReader.getDouble
> [error]   override def getDouble(i: Int): Double = getLong(i)
> {code}
>
> Examples of the compilation warning are shown above; there are probably over 100 similar cases that need to be fixed.
[jira] [Resolved] (SPARK-45699) Fix "Widening conversion from `TypeA` to `TypeB` is deprecated because it loses precision"
[ https://issues.apache.org/jira/browse/SPARK-45699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie resolved SPARK-45699. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43890 [https://github.com/apache/spark/pull/43890] > Fix "Widening conversion from `TypeA` to `TypeB` is deprecated because it > loses precision" > -- > > Key: SPARK-45699 > URL: https://issues.apache.org/jira/browse/SPARK-45699 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Hannah Amundson >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > {code:java} > error] > /Users/yangjie01/SourceCode/git/spark-mine-sbt/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala:1199:67: > Widening conversion from Long to Double is deprecated because it loses > precision. Write `.toDouble` instead. [quickfixable] > [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg= of the message>, cat=deprecation, > site=org.apache.spark.scheduler.TaskSetManager.checkSpeculatableTasks.threshold > [error] val threshold = max(speculationMultiplier * medianDuration, > minTimeToSpeculation) > [error] ^ > [error] > /Users/yangjie01/SourceCode/git/spark-mine-sbt/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala:1207:60: > Widening conversion from Long to Double is deprecated because it loses > precision. Write `.toDouble` instead. 
[quickfixable] > [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg= of the message>, cat=deprecation, > site=org.apache.spark.scheduler.TaskSetManager.checkSpeculatableTasks > [error] foundTasks = checkAndSubmitSpeculatableTasks(timeMs, threshold, > customizedThreshold = true) > [error] ^ > [error] > /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow/ArrowVectorReader.scala:137:48: > Widening conversion from Int to Float is deprecated because it loses > precision. Write `.toFloat` instead. [quickfixable] > [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg= of the message>, cat=deprecation, > site=org.apache.spark.sql.connect.client.arrow.IntVectorReader.getFloat > [error] override def getFloat(i: Int): Float = getInt(i) > [error] ^ > [error] > /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow/ArrowVectorReader.scala:146:49: > Widening conversion from Long to Float is deprecated because it loses > precision. Write `.toFloat` instead. [quickfixable] > [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg= of the message>, cat=deprecation, > site=org.apache.spark.sql.connect.client.arrow.BigIntVectorReader.getFloat > [error] override def getFloat(i: Int): Float = getLong(i) > [error] ^ > [error] > /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow/ArrowVectorReader.scala:147:51: > Widening conversion from Long to Double is deprecated because it loses > precision. Write `.toDouble` instead. 
[quickfixable] > [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg= of the message>, cat=deprecation, > site=org.apache.spark.sql.connect.client.arrow.BigIntVectorReader.getDouble > [error] override def getDouble(i: Int): Double = getLong(i) > [error] ^ {code} > > > The example of the compilation warning is as above, there are probably over > 100 similar cases that need to be fixed. > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
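The deprecation these fixes address comes from a genuine hazard: a 64-bit `Long` has more significand bits than a `Double` (53 bits) or a `Float` (24 bits), so a bare widening conversion can silently change the value. A small Python sketch of the precision loss (illustrative only, not the Spark code above):

```python
import struct

# A Long has up to 63 bits of magnitude, but a double has a 53-bit
# significand, so sufficiently large integers collapse to the same double.
big = 2**53 + 1
assert float(big) == float(2**53)   # the +1 is silently lost

def to_f32(x: int) -> float:
    """Round-trip an int through a 32-bit float, mimicking Int/Long -> Float."""
    return struct.unpack('f', struct.pack('f', float(x)))[0]

# A 32-bit float has a 24-bit significand, so even a moderate Int collapses:
assert to_f32(2**24 + 1) == to_f32(2**24)
```

Writing `.toDouble` / `.toFloat` explicitly, as the warning suggests, does not avoid the loss; it only makes the programmer's acceptance of it visible at the call site.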
[jira] [Updated] (SPARK-45888) Apply error class framework to state data source & state metadata data source
[ https://issues.apache.org/jira/browse/SPARK-45888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45888: --- Labels: pull-request-available (was: ) > Apply error class framework to state data source & state metadata data source > - > > Key: SPARK-45888 > URL: https://issues.apache.org/jira/browse/SPARK-45888 > Project: Spark > Issue Type: Task > Components: Structured Streaming >Affects Versions: 4.0.0 >Reporter: Jungtaek Lim >Priority: Blocker > Labels: pull-request-available > > Intended to be a blocker issue for the release of state data source reader. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-46110) Use error classes in catalog, conf, connect, observation, pandas modules
[ https://issues.apache.org/jira/browse/SPARK-46110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46110: --- Labels: pull-request-available (was: ) > Use error classes in catalog, conf, connect, observation, pandas modules > > > Key: SPARK-46110 > URL: https://issues.apache.org/jira/browse/SPARK-46110 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-46110) Use error classes in catalog, conf, connect, observation, pandas modules
Hyukjin Kwon created SPARK-46110: Summary: Use error classes in catalog, conf, connect, observation, pandas modules Key: SPARK-46110 URL: https://issues.apache.org/jira/browse/SPARK-46110 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 4.0.0 Reporter: Hyukjin Kwon -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Deleted] (SPARK-46109) Migrate to error classes in PySpark
[ https://issues.apache.org/jira/browse/SPARK-46109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon deleted SPARK-46109: - > Migrate to error classes in PySpark > --- > > Key: SPARK-46109 > URL: https://issues.apache.org/jira/browse/SPARK-46109 > Project: Spark > Issue Type: Umbrella >Reporter: Hyukjin Kwon >Priority: Major > > SPARK-41597 continues here to use error classes in PySpark. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-46109) Migrate to error classes in PySpark
[ https://issues.apache.org/jira/browse/SPARK-46109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-46109: - > Migrate to error classes in PySpark > --- > > Key: SPARK-46109 > URL: https://issues.apache.org/jira/browse/SPARK-46109 > Project: Spark > Issue Type: Umbrella > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > > SPARK-41597 continues here to use error classes in PySpark. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-46109) Migrate to error classes in PySpark
Hyukjin Kwon created SPARK-46109: Summary: Migrate to error classes in PySpark Key: SPARK-46109 URL: https://issues.apache.org/jira/browse/SPARK-46109 Project: Spark Issue Type: Improvement Components: PySpark Affects Versions: 4.0.0 Reporter: Hyukjin Kwon SPARK-41597 continues here to use error classes in PySpark. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-46108) XML: keepInnerXmlAsRaw option
Ufuk Süngü created SPARK-46108: -- Summary: XML: keepInnerXmlAsRaw option Key: SPARK-46108 URL: https://issues.apache.org/jira/browse/SPARK-46108 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 4.0.0 Reporter: Ufuk Süngü The built-in XML data source exposes the values and schema of inner or nested elements. However, developers must perform additional operations manually to convert this unstructured data into a structured, tabular format. If nested elements are kept in an XML-compatible format (at each level), they can easily be converted to a structured, tabular format with the methods that have already been developed (the infer method of XmlInferSchema and the parseColumn method of StaxXmlParser). Therefore, there should be an option affecting the StaxXmlParser and InferSchema classes that keeps inner XML elements in their original, raw format. https://github.com/apache/spark/pull/44022 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
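Conceptually, the proposed option defers parsing of a nested element by keeping its original XML text instead of immediately inferring a struct for it. A plain-Python sketch of the idea using the standard library (this is not Spark's StaxXmlParser, just an illustration of the two behaviours):

```python
import xml.etree.ElementTree as ET

doc = "<book><title>Spark</title><meta><isbn>123</isbn><pages>600</pages></meta></book>"
root = ET.fromstring(doc)

# Default behaviour (conceptually): nested elements are parsed into structure,
# analogous to the data source inferring a struct for <meta>.
parsed = {child.tag: child.text for child in root.find("meta")}
assert parsed == {"isbn": "123", "pages": "600"}

# "Keep inner XML as raw" (conceptually): the nested element is preserved as
# its original XML string, so schema inference/parsing can happen later with
# the existing machinery.
raw_inner = ET.tostring(root.find("meta"), encoding="unicode")
assert raw_inner == "<meta><isbn>123</isbn><pages>600</pages></meta>"
```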
[jira] [Commented] (SPARK-32933) Use keyword-only syntax for keyword_only methods
[ https://issues.apache.org/jira/browse/SPARK-32933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789874#comment-17789874 ] Hyukjin Kwon commented on SPARK-32933: -- Here are the PR and JIRA: https://github.com/apache/spark/pull/44023 https://issues.apache.org/jira/browse/SPARK-46107 > Use keyword-only syntax for keyword_only methods > > > Key: SPARK-32933 > URL: https://issues.apache.org/jira/browse/SPARK-32933 > Project: Spark > Issue Type: Improvement > Components: PySpark >Affects Versions: 3.1.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Minor > Fix For: 3.1.0 > > > Since Python 3.0, there is syntax for indicating keyword-only arguments ([PEP > 3102|https://www.python.org/dev/peps/pep-3102/]). > It is not a full replacement for our current usage of {{keyword_only}}, but > it would allow us to make our expectations explicit: > {code:python} > @keyword_only > def __init__(self, degree=2, inputCol=None, outputCol=None): > {code} > {code:python} > @keyword_only > def __init__(self, *, degree=2, inputCol=None, outputCol=None): > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
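The PEP 3102 syntax quoted in the issue can be demonstrated in plain Python (parameter names below mirror the Jira example; no Spark decorator involved):

```python
def init_old(degree=2, inputCol=None, outputCol=None):
    # pre-PEP-3102 style: nothing prevents positional calls
    return degree, inputCol, outputCol

def init_new(*, degree=2, inputCol=None, outputCol=None):
    # keyword-only: the bare * rejects positional arguments at call time
    return degree, inputCol, outputCol

assert init_old(3, "in") == (3, "in", None)            # silently accepted
assert init_new(degree=3, inputCol="in") == (3, "in", None)

raised = False
try:
    init_new(3, "in")                                  # positional call fails
except TypeError:
    raised = True
assert raised
```

This is why the syntax makes expectations explicit: misuse becomes a `TypeError` at the call site instead of a silently mis-bound argument.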
[jira] [Updated] (SPARK-46107) Deprecate pyspark.keyword_only API
[ https://issues.apache.org/jira/browse/SPARK-46107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46107: --- Labels: pull-request-available (was: ) > Deprecate pyspark.keyword_only API > -- > > Key: SPARK-46107 > URL: https://issues.apache.org/jira/browse/SPARK-46107 > Project: Spark > Issue Type: Improvement > Components: ML, PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > See https://issues.apache.org/jira/browse/SPARK-32933. We don't need this > anymore -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-46107) Deprecate pyspark.keyword_only API
Hyukjin Kwon created SPARK-46107: Summary: Deprecate pyspark.keyword_only API Key: SPARK-46107 URL: https://issues.apache.org/jira/browse/SPARK-46107 Project: Spark Issue Type: Improvement Components: ML, PySpark Affects Versions: 4.0.0 Reporter: Hyukjin Kwon See https://issues.apache.org/jira/browse/SPARK-32933. We don't need this anymore -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-46094) Add support for code profiling executors
[ https://issues.apache.org/jira/browse/SPARK-46094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46094: --- Labels: pull-request-available (was: ) > Add support for code profiling executors > > > Key: SPARK-46094 > URL: https://issues.apache.org/jira/browse/SPARK-46094 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 4.0.0 >Reporter: Parth Chandra >Priority: Major > Labels: pull-request-available > > To profile a Spark application, a user or developer has to run a Spark job > locally on the development machine and use a tool like Java Flight Recorder, > YourKit, or async-profiler to record profiling information. Because profiling > can be expensive, the profiler is typically attached to the Spark JVM process > after the process has started and detached once sufficient profiling data has been > collected. > The developer's environment is frequently different from the production > environment and may not yield accurate information. > However, profiling is hard when a Spark application runs as a > distributed job on a cluster where the developer may have limited access to > the actual nodes where the executor processes are running. Also, in > environments like Kubernetes where the executor pods may be removed as soon > as the job completes, retrieving the profiling information from each executor > pod can become quite tricky. > This feature adds a low-overhead sampling profiler like async-profiler > as a built-in capability of the Spark job that can be turned on using only > user-configurable parameters (async-profiler is a low-overhead profiler that > can be invoked programmatically and is available as a single multi-platform > jar, for Linux and macOS). > In addition, for convenience, the feature would save profiling output files > to the distributed file system so that information from all executors can be > available in a single place. 
> The feature would add an executor plugin that does not add any overhead > unless enabled and can be configured to accept profiler arguments as a > configuration parameter. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46074) [CONNECT][SCALA] Insufficient details in error when a UDF fails
[ https://issues.apache.org/jira/browse/SPARK-46074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46074. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43983 [https://github.com/apache/spark/pull/43983] > [CONNECT][SCALA] Insufficient details in error when a UDF fails > --- > > Key: SPARK-46074 > URL: https://issues.apache.org/jira/browse/SPARK-46074 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.5.0 >Reporter: Niranjan Jayakar >Assignee: Niranjan Jayakar >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Currently, when a UDF fails the connect client does not receive the actual > error that caused the failure. > As an example, the error message looks like - > {code:java} > Exception in thread "main" org.apache.spark.SparkException: > grpc_shaded.io.grpc.StatusRuntimeException: INTERNAL: Job aborted due to > stage failure: Task 2 in stage 0.0 failed 4 times, most recent failure: Lost > task 2.3 in stage 0.0 (TID 10) (10.68.141.158 executor 0): > org.apache.spark.SparkException: [FAILED_EXECUTE_UDF] Failed to execute user > defined function (` (Main$$$Lambda$4770/1714264622)`: (int) => int). > SQLSTATE: 39000 {code} > In this case, the actual error was a {{{}java.lang.NoClassDefFoundError{}}}. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46074) [CONNECT][SCALA] Insufficient details in error when a UDF fails
[ https://issues.apache.org/jira/browse/SPARK-46074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46074: Assignee: Niranjan Jayakar > [CONNECT][SCALA] Insufficient details in error when a UDF fails > --- > > Key: SPARK-46074 > URL: https://issues.apache.org/jira/browse/SPARK-46074 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.5.0 >Reporter: Niranjan Jayakar >Assignee: Niranjan Jayakar >Priority: Major > Labels: pull-request-available > > Currently, when a UDF fails the connect client does not receive the actual > error that caused the failure. > As an example, the error message looks like - > {code:java} > Exception in thread "main" org.apache.spark.SparkException: > grpc_shaded.io.grpc.StatusRuntimeException: INTERNAL: Job aborted due to > stage failure: Task 2 in stage 0.0 failed 4 times, most recent failure: Lost > task 2.3 in stage 0.0 (TID 10) (10.68.141.158 executor 0): > org.apache.spark.SparkException: [FAILED_EXECUTE_UDF] Failed to execute user > defined function (` (Main$$$Lambda$4770/1714264622)`: (int) => int). > SQLSTATE: 39000 {code} > In this case, the actual error was a {{{}java.lang.NoClassDefFoundError{}}}. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-45311) Encoder fails on many "NoSuchElementException: None.get" since 3.4.x, search for an encoder for a generic type, and since 3.5.x isn't "an expression encoder"
[ https://issues.apache.org/jira/browse/SPARK-45311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789867#comment-17789867 ] Giambattista Bloisi commented on SPARK-45311: - The issue arises while Encoders.bean is inferring the schema for the JeuDeDonnees class. This class has a field of type Ressources.class, which extends a LinkedHashMap. A simple work-around to let the tests pass is to modify JeuDeDonnees and declare ressources as a Map: {code:java} private Map ressources; //... public Map getRessources() { //...{code} and, when required, iterate the values explicitly: {code:java} jeuDeDonnees.getRessources().values().forEach {code} The exception is thrown because the code assumes (wrongly in this case) that if a class (such as Ressources.class) is a Map, then it has generic type information attached to it; here, instead, the information is available on the base/super class. There is a wider problem behind this. There are cases where mapping to a Spark schema would be ambiguous, for example: * Ressources could also have getters and setters; should it be mapped as a map or a struct? * A class could implement both the List and Map interfaces; should it be mapped as an array or a map? IMO the workaround is also a good idiomatic way to structure beans to be used with Spark, as it makes the mapping explicit and removes the possibility of ambiguity. 
> Encoder fails on many "NoSuchElementException: None.get" since 3.4.x, search > for an encoder for a generic type, and since 3.5.x isn't "an expression > encoder" > - > > Key: SPARK-45311 > URL: https://issues.apache.org/jira/browse/SPARK-45311 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.4.0, 3.4.1, 3.5.0 > Environment: Debian 12 > Java 17 > Underlying Spring-Boot 2.7.14 >Reporter: Marc Le Bihan >Priority: Major > Attachments: JavaTypeInference_116.png, sparkIssue_02.png > > > If you find it convenient, you might clone the > [https://gitlab.com/territoirevif/minimal-tests-spark-issue] project (that > does many operations around cities, local authorities and accounting with > open data) where I've extracted from my work what's necessary to make a set > of 35 tests that run correctly with Spark 3.3.x, and show the troubles > encountered with 3.4.x and 3.5.x. > > It is working well with Spark 3.2.x and 3.3.x. But as soon as I select {*}Spark > 3.4.x{*}, where the encoder seems to have deeply changed, the encoder fails > with two problems: > > *1)* It throws *java.util.NoSuchElementException: None.get* messages > everywhere. > Asking over the Internet, I wasn't alone facing this problem. Reading it, > you'll see that I've attempted a debug but my Scala skills are low. 
> [https://stackoverflow.com/questions/76036349/encoders-bean-doesnt-work-anymore-on-a-java-pojo-with-spark-3-4-0] > {color:#172b4d}by the way, if possible, the encoder and decoder functions > should forward a parameter as soon as the name of the field being handled is > known, and then all along their process, so that when the encoder is at > any point where it has to throw an exception, it knows the field it is > handling in its specific call and can send a message like:{color} > {color:#00875a}_java.util.NoSuchElementException: None.get when encoding [the > method or field it was targeting]_{color} > > *2)* *Not found an encoder of the type RS to Spark SQL internal > representation.* Consider to change the input type to one of supported at > (...) > Or : Not found an encoder of the type *OMI_ID* to Spark SQL internal > representation (...) > > where *RS* and *OMI_ID* are generic types. > This is strange. > [https://stackoverflow.com/questions/76045255/encoders-bean-attempts-to-check-the-validity-of-a-return-type-considering-its-ge] > > *3)* When I switch to the *Spark 3.5.0* version, the same problems remain, > but another adds itself to the list: > "{*}Only expression encoders are supported for now{*}" on what was accepted > and working before. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
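The diagnosis in the comment above, that the generic type information lives on the superclass rather than on the field's declared type, has a direct analogue in Python's typing machinery. This sketch (plain Python, not Spark code; class names borrowed from the Jira report for illustration) shows why inspecting only the field's annotation finds no type arguments:

```python
from typing import Dict, get_args, get_type_hints

class Ressources(Dict[str, int]):
    """Subclass of a generic map: the type arguments live on the BASE class."""
    pass

class JeuDeDonnees:
    ressources: Ressources   # the field's own annotation carries no type args

hints = get_type_hints(JeuDeDonnees)

# Looking only at the field's declared type, there are no generic arguments,
# which mirrors the failed lookup that produced "None.get" in the report:
assert get_args(hints["ressources"]) == ()

# The arguments are only recoverable from the superclass declaration:
assert Dict[str, int] in Ressources.__orig_bases__
```

Declaring the field as a plain `Map`/`Dict` with explicit type arguments, as the workaround suggests, puts the information where a naive field-level lookup expects it.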
[jira] [Updated] (SPARK-42655) Incorrect ambiguous column reference error
[ https://issues.apache.org/jira/browse/SPARK-42655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-42655: --- Labels: pull-request-available (was: ) > Incorrect ambiguous column reference error > -- > > Key: SPARK-42655 > URL: https://issues.apache.org/jira/browse/SPARK-42655 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.2.0 >Reporter: Shrikant Prasad >Assignee: Shrikant Prasad >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > val df1 = > sc.parallelize(List((1,2,3,4,5),(1,2,3,4,5))).toDF("id","col2","col3","col4", > "col5") > val op_cols_same_case = List("id","col2","col3","col4", "col5", "id") > val df2 = df1.select(op_cols_same_case.head, op_cols_same_case.tail: _*) > df2.select("id").show() > > This query runs fine. > > But when we change the casing of the op_cols to have mix of upper & lower > case ("id" & "ID") it throws an ambiguous col ref error: > > val df1 = > sc.parallelize(List((1,2,3,4,5),(1,2,3,4,5))).toDF("id","col2","col3","col4", > "col5") > val op_cols_mixed_case = List("id","col2","col3","col4", "col5", "ID") > val df3 = df1.select(op_cols_mixed_case.head, op_cols_mixed_case.tail: _*) > df3.select("id").show() > org.apache.spark.sql.AnalysisException: Reference 'id' is ambiguous, could > be: id, id. 
> at > org.apache.spark.sql.catalyst.expressions.package$AttributeSeq.resolve(package.scala:363) > at > org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveChildren(LogicalPlan.scala:112) > at > org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$resolveExpressionByPlanChildren$1(Analyzer.scala:1857) > at > org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$resolveExpression$2(Analyzer.scala:1787) > at > org.apache.spark.sql.catalyst.analysis.package$.withPosition(package.scala:60) > at > org.apache.spark.sql.catalyst.analysis.Analyzer.innerResolve$1(Analyzer.scala:1794) > at > org.apache.spark.sql.catalyst.analysis.Analyzer.resolveExpression(Analyzer.scala:1812) > at > org.apache.spark.sql.catalyst.analysis.Analyzer.resolveExpressionByPlanChildren(Analyzer.scala:1863) > at > org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$apply$17.$anonfun$applyOrElse$94(Analyzer.scala:1577) > at > org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$1(QueryPlan.scala:193) > at > org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82) > at > org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:193) > at > org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:204) > at > org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$3(QueryPlan.scala:209) > at > scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) > at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) > at scala.collection.TraversableLike.map(TraversableLike.scala:286) > at scala.collection.TraversableLike.map$(TraversableLike.scala:279) > at scala.collection.AbstractTraversable.map(Traversable.scala:108) > at > 
org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:209) > > Since Spark is case insensitive, it should also work in the second case, when we > have both upper- and lower-case column names in the column list. > It also works fine in Spark 2.3. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
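A minimal Python sketch of the resolution logic at issue (an assumed model, not Spark's actual resolver): a case-insensitive lookup over a select list containing both "id" and "ID" need not be ambiguous, because both names refer to the same underlying attribute; genuine ambiguity only arises when the matches are distinct attributes.

```python
def resolve(name, columns, case_sensitive=False):
    """columns: list of (attribute_id, display_name) pairs. Returns the
    attribute id; raises only when matches are genuinely different attributes."""
    key = name if case_sensitive else name.lower()
    matches = [
        attr_id for attr_id, col in columns
        if (col if case_sensitive else col.lower()) == key
    ]
    distinct = set(matches)
    if not distinct:
        raise KeyError(name)
    if len(distinct) > 1:
        raise ValueError(f"Reference '{name}' is ambiguous")
    return distinct.pop()

# df1.select("id", ..., "ID"): both names alias the same attribute, #1
cols = [(1, "id"), (2, "col2"), (3, "col3"), (4, "col4"), (5, "col5"), (1, "ID")]
assert resolve("id", cols) == 1          # duplicated alias -> not ambiguous

# A true ambiguity: two different attributes named "id" (e.g. after a join)
joined = [(1, "id"), (7, "id")]
try:
    resolve("id", joined)
except ValueError as e:
    assert "ambiguous" in str(e)
```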
[jira] [Resolved] (SPARK-45974) Add scan.filterAttributes non-empty judgment for RowLevelOperationRuntimeGroupFiltering
[ https://issues.apache.org/jira/browse/SPARK-45974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-45974. - Fix Version/s: 3.5.1 4.0.0 Resolution: Fixed Issue resolved by pull request 43869 [https://github.com/apache/spark/pull/43869] > Add scan.filterAttributes non-empty judgment for > RowLevelOperationRuntimeGroupFiltering > --- > > Key: SPARK-45974 > URL: https://issues.apache.org/jira/browse/SPARK-45974 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Zhen Wang >Assignee: Zhen Wang >Priority: Major > Labels: pull-request-available > Fix For: 3.5.1, 4.0.0 > > > When scan.filterAttributes is empty, an invalid dynamic Pruning condition > will be generated in RowLevelOperationRuntimeGroupFiltering -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45974) Add scan.filterAttributes non-empty judgment for RowLevelOperationRuntimeGroupFiltering
[ https://issues.apache.org/jira/browse/SPARK-45974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-45974: --- Assignee: Zhen Wang > Add scan.filterAttributes non-empty judgment for > RowLevelOperationRuntimeGroupFiltering > --- > > Key: SPARK-45974 > URL: https://issues.apache.org/jira/browse/SPARK-45974 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Zhen Wang >Assignee: Zhen Wang >Priority: Major > Labels: pull-request-available > > When scan.filterAttributes is empty, an invalid dynamic Pruning condition > will be generated in RowLevelOperationRuntimeGroupFiltering -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-46106) If the hive table is an external table, the external table information will be displayed during ShowCreateTableCommand.
[ https://issues.apache.org/jira/browse/SPARK-46106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46106: --- Labels: pull-request-available (was: ) > If the hive table is an external table, the external table information will be displayed > during ShowCreateTableCommand. > -- > > Key: SPARK-46106 > URL: https://issues.apache.org/jira/browse/SPARK-46106 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.5.0 >Reporter: guihuawen >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0 > > > For example: > CREATE EXTERNAL TABLE test_extaral_1 (a String); > When using SHOW CREATE TABLE, if the table is an external table, the output does not > show that it is external. > spark-sql> show create table test_extaral_1; > createtab_stmt > CREATE TABLE `test`.`test_extaral_1` ( > `a` STRING) > USING orc > LOCATION '/test/test_extaral_1' > > After the change, the output shows whether the table is external: > spark-sql> show create table test_extaral_1; > createtab_stmt > CREATE EXTERNAL TABLE `test`.`test_extaral_1` ( > `a` STRING) > USING orc > LOCATION '/test/test_extaral_1' > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-46106) If the hive table is an external table, the external table information will be displayed during ShowCreateTableCommand.
guihuawen created SPARK-46106: - Summary: If the hive table is an external table, the external table information will be displayed during ShowCreateTableCommand. Key: SPARK-46106 URL: https://issues.apache.org/jira/browse/SPARK-46106 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.5.0 Reporter: guihuawen Fix For: 3.5.0 For example: CREATE EXTERNAL TABLE test_extaral_1 (a String); When using SHOW CREATE TABLE, if the table is an external table, the output does not show that it is external. spark-sql> show create table test_extaral_1; createtab_stmt CREATE TABLE `test`.`test_extaral_1` ( `a` STRING) USING orc LOCATION '/test/test_extaral_1' After the change, the output shows whether the table is external: spark-sql> show create table test_extaral_1; createtab_stmt CREATE EXTERNAL TABLE `test`.`test_extaral_1` ( `a` STRING) USING orc LOCATION '/test/test_extaral_1' -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-39769) Rename trait Unevaluable
[ https://issues.apache.org/jira/browse/SPARK-39769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-39769: --- Labels: pull-request-available (was: ) > Rename trait Unevaluable > > > Key: SPARK-39769 > URL: https://issues.apache.org/jira/browse/SPARK-39769 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Ted Yu >Priority: Minor > Labels: pull-request-available > > I came upon `trait Unevaluable` which is defined in > sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala > Unevaluable is not a word. > There are `valuable`, `invaluable` but I have never seen Unevaluable. > This issue renames the trait to Unevaluatable -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-45826) Add a SQL config for extra stack traces in Origin
[ https://issues.apache.org/jira/browse/SPARK-45826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-45826. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43695 [https://github.com/apache/spark/pull/43695] > Add a SQL config for extra stack traces in Origin > - > > Key: SPARK-45826 > URL: https://issues.apache.org/jira/browse/SPARK-45826 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Add a SQL config to control how many extra stack traces should be captured in > the withOrigin method. This should improve user experience in troubleshooting > issues. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-46101) Replace (string|array).size with (string|array).length in module SQL
[ https://issues.apache.org/jira/browse/SPARK-46101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46101: --- Labels: pull-request-available (was: ) > Replace (string|array).size with (string|array).length in module SQL > > > Key: SPARK-46101 > URL: https://issues.apache.org/jira/browse/SPARK-46101 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Jiaan Geng >Assignee: Jiaan Geng >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org