Sajeev Ramakrishnan created SPARK-22663:
-------------------------------------------

             Summary: Spark DataSet to case class mapping mismatches
                 Key: SPARK-22663
                 URL: https://issues.apache.org/jira/browse/SPARK-22663
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 2.2.0
            Reporter: Sajeev Ramakrishnan
            Priority: Minor


Dear Team,
  Currently, when we create a Dataset from a data source, we append 
as[<case-class>] to map the rows to a case class. However, if the case class 
has an attribute with no matching column in the data, Spark throws an error.

Example:

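import org.apache.spark.sql.Dataset
import org.apache.spark.sql.functions.{col, split}
import spark.implicits._  // provides the encoder needed by .as[MyClass]

case class MyClass(
  var line: String = "",
  var prevLine: String = ""
)

val raw = spark.read.textFile(<file>)
// Only "line" is selected, so the result has no "prevLine" column.
val a: Dataset[MyClass] = raw
  .withColumn("line", split(col("value"), "\\t"))
  .select(col("line").getItem(0).as("line"))
  .as[MyClass]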

This code fails with an error reporting that no column matches prevLine.

Fixing this would make it easier to build Spark programs with Datasets where 
many joins are involved and each join adds more columns to the result. It is 
difficult to maintain a separate case class for every join.
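Until something like this is supported, one possible workaround is to add the 
missing columns explicitly before calling as[...]. The following is only a 
minimal sketch, not an existing Spark API: the object FillMissingColumns and 
the method asWithMissing are hypothetical names. It assumes the case class is 
defined at top level and that missing fields may be filled with null (the case 
class default values such as "" are not applied).

```scala
import org.apache.spark.sql.{DataFrame, Dataset, Encoder, SparkSession}
import org.apache.spark.sql.functions.{col, lit, split}

// Same shape as the class in the report, defined at top level so that
// Spark can derive an encoder for it.
case class MyClass(line: String = "", prevLine: String = "")

object FillMissingColumns {
  // Hypothetical helper: for every field of T that the DataFrame lacks,
  // add a typed null column, then select the fields in schema order.
  def asWithMissing[T: Encoder](df: DataFrame): Dataset[T] = {
    val target = implicitly[Encoder[T]].schema
    val present = df.columns.toSet
    val filled = target.fields.foldLeft(df) { (acc, field) =>
      if (present.contains(field.name)) acc
      else acc.withColumn(field.name, lit(null).cast(field.dataType))
    }
    filled.select(target.fieldNames.map(col): _*).as[T]
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("fill-missing-columns-sketch")
      .getOrCreate()
    import spark.implicits._

    // In-memory stand-in for the text file used in the report.
    val raw = Seq("first\tsecond").toDF("value")
    val onlyLine = raw.select(split(col("value"), "\\t").getItem(0).as("line"))

    // prevLine is absent from onlyLine; the helper fills it with null.
    val ds: Dataset[MyClass] = asWithMissing[MyClass](onlyLine)
    ds.show()

    spark.stop()
  }
}
```

Because the filler value is null, this sketch only suits fields that can hold 
null, such as String or Option[...]. A primitive field like Int would most 
likely fail when the rows are deserialized.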

Thanks & Regards,
Sajeev Ramakrishnan



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
