[jira] [Commented] (SPARK-42258) pyspark.sql.functions should not expose typing.cast

2023-03-03 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696237#comment-17696237
 ] 

Apache Spark commented on SPARK-42258:
--

User 'FurcyPin' has created a pull request for this issue:
https://github.com/apache/spark/pull/40271

> pyspark.sql.functions should not expose typing.cast
> ---
>
> Key: SPARK-42258
> URL: https://issues.apache.org/jira/browse/SPARK-42258
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark
>Affects Versions: 3.3.1
>Reporter: Furcy Pin
>Priority: Minor
>
> In pyspark, the `pyspark.sql.functions` modules imports and exposes the 
> method `typing.cast`.
> This may lead to errors from users that can be hard to spot.
> *Example*
> It took me a few minutes to understand why the following code:
>  
> {code:java}
> from pyspark.sql import SparkSession
> from pyspark.sql import functions as f
> spark = SparkSession.builder.getOrCreate()
> df = spark.sql("""SELECT 1 as a""")
> df.withColumn("a", f.cast("STRING", f.col("a"))).printSchema()  {code}
> which executes without any problem, gives the following result:
>  
>  
> {code:java}
> root
> |-- a: integer (nullable = false){code}
> This is because `f.cast` here calls `typing.cast, and the correct syntax is:
> {code:java}
> df.withColumn("a", f.col("a").cast("STRING")).printSchema(){code}
>  
> which indeed gives:
> {code:java}
> root
>  |-- a: string (nullable = false) {code}
> *Suggestion of solution*
> Option 1: The methods imported in the module `pyspark.sql.functions` could be 
> obfuscated to prevent this. For instance:
> {code:java}
> from typing import cast as _cast{code}
> Option 2: only import `typing` and replace all occurrences of `cast` with 
> `typing.cast`



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-42258) pyspark.sql.functions should not expose typing.cast

2023-02-13 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17688191#comment-17688191
 ] 

Hyukjin Kwon commented on SPARK-42258:
--

either way is fine to me

> pyspark.sql.functions should not expose typing.cast
> ---
>
> Key: SPARK-42258
> URL: https://issues.apache.org/jira/browse/SPARK-42258
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark
>Affects Versions: 3.3.1
>Reporter: Furcy Pin
>Priority: Minor
>
> In pyspark, the `pyspark.sql.functions` modules imports and exposes the 
> method `typing.cast`.
> This may lead to errors from users that can be hard to spot.
> *Example*
> It took me a few minutes to understand why the following code:
>  
> {code:java}
> from pyspark.sql import SparkSession
> from pyspark.sql import functions as f
> spark = SparkSession.builder.getOrCreate()
> df = spark.sql("""SELECT 1 as a""")
> df.withColumn("a", f.cast("STRING", f.col("a"))).printSchema()  {code}
> which executes without any problem, gives the following result:
>  
>  
> {code:java}
> root
> |-- a: integer (nullable = false){code}
> This is because `f.cast` here calls `typing.cast, and the correct syntax is:
> {code:java}
> df.withColumn("a", f.col("a").cast("STRING")).printSchema(){code}
>  
> which indeed gives:
> {code:java}
> root
>  |-- a: string (nullable = false) {code}
> *Suggestion of solution*
> Option 1: The methods imported in the module `pyspark.sql.functions` could be 
> obfuscated to prevent this. For instance:
> {code:java}
> from typing import cast as _cast{code}
> Option 2: only import `typing` and replace all occurrences of `cast` with 
> `typing.cast`



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-42258) pyspark.sql.functions should not expose typing.cast

2023-02-13 Thread Furcy Pin (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17688120#comment-17688120
 ] 

Furcy Pin commented on SPARK-42258:
---

Sure, I can do that. 
Do you have any preference between option 1 and 2 ? I believe 2 is cleaner.

> pyspark.sql.functions should not expose typing.cast
> ---
>
> Key: SPARK-42258
> URL: https://issues.apache.org/jira/browse/SPARK-42258
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark
>Affects Versions: 3.3.1
>Reporter: Furcy Pin
>Priority: Minor
>
> In pyspark, the `pyspark.sql.functions` modules imports and exposes the 
> method `typing.cast`.
> This may lead to errors from users that can be hard to spot.
> *Example*
> It took me a few minutes to understand why the following code:
>  
> {code:java}
> from pyspark.sql import SparkSession
> from pyspark.sql import functions as f
> spark = SparkSession.builder.getOrCreate()
> df = spark.sql("""SELECT 1 as a""")
> df.withColumn("a", f.cast("STRING", f.col("a"))).printSchema()  {code}
> which executes without any problem, gives the following result:
>  
>  
> {code:java}
> root
> |-- a: integer (nullable = false){code}
> This is because `f.cast` here calls `typing.cast, and the correct syntax is:
> {code:java}
> df.withColumn("a", f.col("a").cast("STRING")).printSchema(){code}
>  
> which indeed gives:
> {code:java}
> root
>  |-- a: string (nullable = false) {code}
> *Suggestion of solution*
> Option 1: The methods imported in the module `pyspark.sql.functions` could be 
> obfuscated to prevent this. For instance:
> {code:java}
> from typing import cast as _cast{code}
> Option 2: only import `typing` and replace all occurrences of `cast` with 
> `typing.cast`



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-42258) pyspark.sql.functions should not expose typing.cast

2023-02-12 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687757#comment-17687757
 ] 

Hyukjin Kwon commented on SPARK-42258:
--

Good point. Are you interested in submitting a PR?

> pyspark.sql.functions should not expose typing.cast
> ---
>
> Key: SPARK-42258
> URL: https://issues.apache.org/jira/browse/SPARK-42258
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark
>Affects Versions: 3.3.1
>Reporter: Furcy Pin
>Priority: Minor
>
> In pyspark, the `pyspark.sql.functions` modules imports and exposes the 
> method `typing.cast`.
> This may lead to errors from users that can be hard to spot.
> *Example*
> It took me a few minutes to understand why the following code:
>  
> {code:java}
> from pyspark.sql import SparkSession
> from pyspark.sql import functions as f
> spark = SparkSession.builder.getOrCreate()
> df = spark.sql("""SELECT 1 as a""")
> df.withColumn("a", f.cast("STRING", f.col("a"))).printSchema()  {code}
> which executes without any problem, gives the following result:
>  
>  
> {code:java}
> root
> |-- a: integer (nullable = false){code}
> This is because `f.cast` here calls `typing.cast, and the correct syntax is:
> {code:java}
> df.withColumn("a", f.col("a").cast("STRING")).printSchema(){code}
>  
> which indeed gives:
> {code:java}
> root
>  |-- a: string (nullable = false) {code}
> *Suggestion of solution*
> Option 1: The methods imported in the module `pyspark.sql.functions` could be 
> obfuscated to prevent this. For instance:
> {code:java}
> from typing import cast as _cast{code}
> Option 2: only import `typing` and replace all occurrences of `cast` with 
> `typing.cast`



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org