Vinay varma created SPARK-19315:
-----------------------------------

             Summary: StructType should support nested lookup; throws IllegalArgumentException
                 Key: SPARK-19315
                 URL: https://issues.apache.org/jira/browse/SPARK-19315
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.0.2, 1.6.1
            Reporter: Vinay varma
            Priority: Minor

Datasets support class composition, and the .joinWith operation on a Dataset also produces a composed type. However, StructType throws IllegalArgumentException for a nested lookup such as schema("a.id"). Since many validations check the schema, those components are limited to flattened datasets only (e.g. org.apache.spark.ml.feature.StringIndexer).

Is there any reason for not supporting such operations? From an initial check, it looks like adding support for such lookups would break the existing contract of this method in org.apache.spark.sql.types.StructType:

def fieldIndex(name: String): Int

Example code that reproduces the failure:

import org.apache.spark.sql.SparkSession
import org.scalatest.FunSuite

case class A(id: Int, name: String)
case class B(id: Int, location: String)

class TestCompositionStruct extends FunSuite {
  val spark = SparkSession.builder()
    .appName("TestCompositionStruct")
    .master("local[4]")
    .getOrCreate()
  import spark.implicits._

  val adf = spark.createDataFrame(List(A(1, "X"), A(2, "Y"))).as[A]
  val bdf = spark.createDataFrame(List(B(1, "X_loc"), B(2, "Y_loc"))).as[B]

  test("supportNestedDataset") {
    // joinWith produces a composed (A, B) type; rename the tuple columns to a and b
    val jdf = adf.joinWith(bdf, adf("id") === bdf("id"))
      .withColumnRenamed("_1", "a")
      .withColumnRenamed("_2", "b")
      .as[(A, B)]
    // Selecting the nested column by its dotted path works
    assert(jdf.select("a.id").count() > 0)
    // Looking up the same dotted path on the schema throws
    intercept[IllegalArgumentException](jdf.schema("a.id"))
  }
}
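
For discussion, here is a minimal sketch of what a nested lookup over a StructType could look like if done outside of StructType itself, so that fieldIndex and apply keep their current behaviour. The object NestedLookup and the method findNestedField are hypothetical names, not part of Spark. The sketch assumes that "." only ever separates nesting levels, so a top-level field whose name itself contains a dot would not be found, and it does not descend into arrays or maps.

```scala
import org.apache.spark.sql.types._

// Hypothetical helper, not part of Spark's API.
object NestedLookup {

  /** Resolve a dotted path such as "a.id" against a schema.
    * Returns None if any segment is missing or if a non-struct field
    * is reached before the path is exhausted.
    */
  def findNestedField(schema: StructType, path: String): Option[StructField] = {
    @annotation.tailrec
    def loop(current: StructType, parts: List[String]): Option[StructField] =
      parts match {
        case Nil => None
        case name :: rest =>
          current.fields.find(_.name == name) match {
            // Last segment: this is the field being asked for
            case Some(field) if rest.isEmpty => Some(field)
            // More segments remain: only a struct can be descended into
            case Some(StructField(_, nested: StructType, _, _)) => loop(nested, rest)
            // Segment missing, or the field is not a struct
            case _ => None
          }
      }
    // Assumption: '.' is used only as the nesting separator
    loop(schema, path.split('.').toList)
  }
}

object NestedLookupExample extends App {
  // Schema shaped like the joined dataset above: one struct column "a"
  val a = StructType(Seq(StructField("id", IntegerType), StructField("name", StringType)))
  val schema = StructType(Seq(StructField("a", a)))

  println(NestedLookup.findNestedField(schema, "a.id").map(_.dataType))
  println(NestedLookup.findNestedField(schema, "a.missing"))
}
```

Returning an Option rather than throwing lets a validation decide for itself whether a missing nested column is an error, which may be a gentler fit for the existing schema checks than changing the exception behaviour of StructType.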