[ https://issues.apache.org/jira/browse/SPARK-24988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-24988.
----------------------------------
    Resolution: Won't Fix

> Add a castBySchema method which casts all the values of a DataFrame based on 
> the DataTypes of a StructType
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-24988
>                 URL: https://issues.apache.org/jira/browse/SPARK-24988
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: mahmoud mehdi
>            Priority: Minor
>
> The main goal of this user story is to extend the DataFrame API with a method 
> that casts all the columns of a DataFrame based on the DataTypes of a 
> StructType.
> This feature can be useful when we have a large DataFrame and need to make 
> multiple casts. In that case, we won't have to cast each column individually; 
> all we have to do is pass a StructType with the types we need to the 
> castBySchema method (in real-world examples, this schema is generally 
> provided by the client, which was my case).
> I'll explain the new feature via an example. Let's create a DataFrame of 
> strings:
> {code:java}
> import spark.implicits._  // assuming a SparkSession named "spark" is in scope
> val df = Seq(("test1", "0"), ("test2", "1")).toDF("name", "id")
> {code}
> Let's suppose that we want to cast the second column's values to integers. 
> All we have to do is the following:
> {code:java}
> import org.apache.spark.sql.types._
> val schema = StructType(Seq(
>   StructField("name", StringType, true),
>   StructField("id", IntegerType, true)))
> {code}
> {code:java}
> df.castBySchema(schema)
> {code}
> I made sure that castBySchema also works with nested StructTypes by adding 
> several tests.
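> Since castBySchema is not part of the current DataFrame API, here is a 
> minimal sketch of how such an extension could look as an implicit class (the 
> name CastBySchemaOps is hypothetical, not an existing Spark class):
> {code:java}
> import org.apache.spark.sql.DataFrame
> import org.apache.spark.sql.functions.col
> import org.apache.spark.sql.types.StructType
>
> // Hypothetical extension, not part of the Spark API.
> // In compiled code this would live inside an object; in spark-shell it can
> // be pasted as-is.
> implicit class CastBySchemaOps(df: DataFrame) {
>   // Cast every column named in the schema to its corresponding DataType.
>   // Column.cast also accepts complex types, which is what makes nested
>   // StructTypes work with the same call.
>   def castBySchema(schema: StructType): DataFrame =
>     df.select(schema.fields.map(f => col(f.name).cast(f.dataType).as(f.name)): _*)
> }
> {code}
> With this implicit in scope, df.castBySchema(schema) behaves as in the 
> example above. The sketch assumes the schema's field names match the 
> DataFrame's column names; mismatches or extra fields would need explicit 
> handling.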
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
