mazeboard edited a comment on issue #24299: [SPARK-27388][SQL] expression encoder for objects defined by properties URL: https://github.com/apache/spark/pull/24299#issuecomment-482167159 I would like to give an example to explain why the PR addition must be in ScalaReflection The test is done with spark-sql 2.4.0 The first transformation (map), from Dataset[Foo) to Dataset[Bar], pass without error, but the second transformation, from Dataset[Foo] to Dataset[(Foo, Bar)] raises an exception, No Encoder found; this is explained by the fact that the tuple encoder will search recursively in ScalaReflection and fails to find an encoder for Foo and Bar import spark.implicits._ implicit val encoderFoo = Encoders.bean[Foo](classOf[Foo]) implicit val encoderBar = Encoders.bean[Bar](classOf[Bar]) val bar = new Bar bar.setBar("ok") val a = new Foo a.setA(55) a.setB(bar) val ds = List(a).toDS() val x = ds.map(_.getB) // Ok val y = ds.map(x => (x, x.getB)) // Ko - No Encoder found for Bar and Foo class Bar { private var bar$: java.lang.String = _ def setBar(value: java.lang.String): Unit = { bar$ = value } def getBar(): java.lang.String = bar$ } class Foo extends Bar { var a: java.lang.Integer = _ var b: Bar = _ def getA() = a def setA(x: java.lang.Integer) {a = x} def getB(): Bar = b def setB(x: Bar) { b = x} } With the PR additon the following code runs without errors import spark.implicits._ implicit val encoderFoo = ExpressionEncoder[Foo] val bar = new Bar bar.setBar("ok") val a = new Foo a.setA(55) a.setB(bar) val ds = List(a).toDS() val x = ds.map(_.getB) // Ok val y = ds.map(x => (x, x.getB)) // Ok It is possible to make the first program work by adding the following implicit: implicit val tupleEncoder: Encoder[(Foo, Bar)] = Encoders.tuple(encoderFoo, encoderBar) But this becomes rapidly a burden if we need to expose many embedded objects within the Avro object. Also, currently the Encoders.bean does not work for avro objects, for instance, fixed type property is ignored; ~~the issue I found with javaTypeInference is that the java reflection cannot determine the element type of a java.util.List[T] nor the keyType/valueType of java.util.Map[K,V].~~ *solved (see next message)* Currently to make encoders for avro objects using reflection, as most common scala objects, we need to modify ScalaReflection as is done in this PR.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
