[ 
https://issues.apache.org/jira/browse/SPARK-12916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

holdenk closed SPARK-12916.
---------------------------
    Resolution: Won't Fix

Since Row is now a subclass of tuple in PySpark, we don't really need this anymore.

> Support Row.fromSeq and Row.toSeq methods in pyspark
> ----------------------------------------------------
>
>                 Key: SPARK-12916
>                 URL: https://issues.apache.org/jira/browse/SPARK-12916
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark, SQL
>            Reporter: Shubhanshu Mishra
>            Priority: Minor
>              Labels: dataframe, pyspark, row, sql
>
> PySpark should also have access to Row functions like fromSeq and toSeq, 
> which are exposed in the Scala API: 
> https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Row
> This will be useful when constructing custom columns from functions called on 
> dataframes. A good example is present in the following SO thread: 
> http://stackoverflow.com/questions/32196207/derive-multiple-columns-from-a-single-column-in-a-spark-dataframe
> {code:scala}
> import org.apache.spark.sql.types._
> import org.apache.spark.sql.Row
> // assumes df: a DataFrame with columns x: Long, y: Double, z: String
> def foobarFunc(x: Long, y: Double, z: String): Seq[Any] = 
>   Seq(x * y, z.head.toInt * y)
> val schema = StructType(df.schema.fields ++
>   Array(StructField("foo", DoubleType), StructField("bar", DoubleType)))
> val rows = df.rdd.map(r => Row.fromSeq(
>   r.toSeq ++
>   foobarFunc(r.getAs[Long]("x"), r.getAs[Double]("y"), r.getAs[String]("z"))))
> val df2 = sqlContext.createDataFrame(rows, schema)
> df2.show
> // +---+----+---+----+-----+
> // |  x|   y|  z| foo|  bar|
> // +---+----+---+----+-----+
> // |  1| 3.0|  a| 3.0|291.0|
> // |  2|-1.0|  b|-2.0|-98.0|
> // |  3| 0.0|  c| 0.0|  0.0|
> // +---+----+---+----+-----+
> {code}
> I am ready to work on this feature. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
