Vinay varma created SPARK-19315:
-----------------------------------

             Summary: StructType should support nested lookup; throws IllegalArgumentException
                 Key: SPARK-19315
                 URL: https://issues.apache.org/jira/browse/SPARK-19315
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.0.2, 1.6.1
            Reporter: Vinay varma
            Priority: Minor


Datasets support class composition, and the .joinWith operation on a Dataset
also results in a composed type. However, StructType throws
IllegalArgumentException for a nested lookup (for example, "a.id"). Since many
validations check the schema (ex: org.apache.spark.ml.feature.StringIndexer),
such components can currently be used with flattened datasets only.

Is there any reason for not supporting such operations?

From an initial check, it looks like adding support for such lookups would
break the existing contract of:
org.apache.spark.sql.types.StructType 
     def fieldIndex(name: String): Int 

Example code that reproduces the failure:

import org.apache.spark.sql.SparkSession
import org.scalatest.FunSuite

case class A(id: Int, name: String)

case class B(id: Int, location: String)

class TestCompositionStruct extends FunSuite {
  val spark = SparkSession.builder()
    .appName("TestCompositionStruct")
    .master("local[4]")
    .getOrCreate()

  import spark.implicits._

  val adf = spark.createDataFrame(List(A(1, "X"), A(2, "Y"))).as[A]
  val bdf = spark.createDataFrame(List(B(1, "X_loc"), B(2, "Y_loc"))).as[B]

  test("supportNestedDataset") {
    // joinWith yields a Dataset of pairs; rename the tuple columns to "a" and "b".
    val jdf = adf.joinWith(bdf, adf("id") === bdf("id"))
      .withColumnRenamed("_1", "a")
      .withColumnRenamed("_2", "b")
      .as[(A, B)]
    // Selecting the nested column by its dotted path works.
    assert(jdf.select("a.id").count() > 0)
    // Looking up the same dotted path on the schema throws.
    intercept[IllegalArgumentException](jdf.schema("a.id"))
  }
}
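
For comparison, a nested lookup can be approximated outside of StructType by
walking the schema one path segment at a time. The following is only a minimal
sketch of that idea, not a proposed patch: NestedLookup and nestedField are
hypothetical names that do not exist in Spark, and the sketch assumes that
field names contain no dots and are matched case-sensitively.

```scala
import org.apache.spark.sql.types.{StructField, StructType}

// Hypothetical helper, not part of Spark.
object NestedLookup {

  /** Resolves a dotted path such as "a.id" against a schema.
    * Returns None if any segment is missing or if a non-final
    * segment is not itself a struct.
    */
  def nestedField(schema: StructType, path: String): Option[StructField] = {
    def loop(struct: StructType, remaining: List[String]): Option[StructField] =
      remaining match {
        case Nil => None
        case name :: rest =>
          struct.fields.find(_.name == name).flatMap { field =>
            if (rest.isEmpty) Some(field)
            else field.dataType match {
              case nested: StructType => loop(nested, rest) // descend one level
              case _                  => None               // cannot descend into a non-struct
            }
          }
      }

    loop(schema, path.split('.').toList)
  }
}
```

With the jdf from the test above, NestedLookup.nestedField(jdf.schema, "a.id")
would be expected to return the StructField for id instead of throwing. It
returns an Option rather than an index, so it leaves the existing
fieldIndex(name: String): Int contract untouched.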



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
