mazeboard edited a comment on issue #24299: [SPARK-27388][SQL] expression encoder for objects defined by properties
URL: https://github.com/apache/spark/pull/24299#issuecomment-482167159
 
 
   I would like to give an example to explain why the addition in this PR
   must be in ScalaReflection.

   The test was done with spark-sql 2.4.0.

   The first transformation (map), from Dataset[Foo] to Dataset[Bar],
   passes without error, but the second transformation, from Dataset[Foo]
   to Dataset[(Foo, Bar)], raises an exception, No Encoder found; this is
   explained by the fact that the tuple encoder searches recursively
   through ScalaReflection and fails to find encoders for Foo and Bar.
   
   
        import spark.implicits._
        import org.apache.spark.sql.Encoders

        class Bar {
            private var bar$: java.lang.String = _
            def setBar(value: java.lang.String): Unit = { bar$ = value }
            def getBar(): java.lang.String = bar$
        }

        class Foo extends Bar {
            private var a: java.lang.Integer = _
            private var b: Bar = _
            def getA(): java.lang.Integer = a
            def setA(x: java.lang.Integer): Unit = { a = x }
            def getB(): Bar = b
            def setB(x: Bar): Unit = { b = x }
        }

        implicit val encoderFoo = Encoders.bean[Foo](classOf[Foo])
        implicit val encoderBar = Encoders.bean[Bar](classOf[Bar])

        val bar = new Bar
        bar.setBar("ok")
        val a = new Foo
        a.setA(55)
        a.setB(bar)
        val ds = List(a).toDS()
        val x = ds.map(_.getB)              // Ok
        val y = ds.map(x => (x, x.getB))    // Ko - No Encoder found for Bar and Foo
   
   
   With the addition in this PR, the following code runs without errors:
   
        import spark.implicits._
        import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder

        implicit val encoderFoo = ExpressionEncoder[Foo]()

        val bar = new Bar
        bar.setBar("ok")
        val a = new Foo
        a.setA(55)
        a.setB(bar)
        val ds = List(a).toDS()
        val x = ds.map(_.getB)              // Ok
        val y = ds.map(x => (x, x.getB))    // Ok
   
   
   It is possible to make the first program work by adding the following
   implicit:

        import org.apache.spark.sql.Encoder

        implicit val tupleEncoder: Encoder[(Foo, Bar)] =
            Encoders.tuple(encoderFoo, encoderBar)
   
   
   But this rapidly becomes a burden if we need to expose many embedded
   objects within the Avro object.
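
   As a minimal sketch of what that looks like, assuming a hypothetical
   additional nested bean Baz (not part of the example above) and the
   imports, classes and encoders from the first program: every further
   combination of types produced by a map needs its own explicit tuple
   encoder.

        // Hypothetical: Baz stands for yet another nested bean inside the Avro object.
        implicit val encoderBaz = Encoders.bean[Baz](classOf[Baz])

        // In addition to tupleEncoder above, each further combination used in a map
        // needs its own implicit value:
        implicit val fooBazEncoder: Encoder[(Foo, Baz)] = Encoders.tuple(encoderFoo, encoderBaz)
        implicit val barBazEncoder: Encoder[(Bar, Baz)] = Encoders.tuple(encoderBar, encoderBaz)
        implicit val fooBarBazEncoder: Encoder[(Foo, Bar, Baz)] =
            Encoders.tuple(encoderFoo, encoderBar, encoderBaz)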
   
   Also, currently the Encoders.bean does not work for avro objects, for 
instance, fixed type property is ignored; the issue I found with 
javaTypeInference is that the java reflection cannot determine the element type 
of a java.util.List[T] nor the keyType/valueType of java.util.Map[K,V].
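
   To make that concrete, here is a minimal sketch with a hypothetical
   Record bean (not taken from the PR or the example above) whose
   collection property has the shape an Avro-generated class exposes; the
   erased return type of the getter is just java.util.List, with no
   element type attached.

        // Hypothetical bean with a generic collection property, similar in shape
        // to what an Avro-generated class exposes.
        class Record {
            private var items: java.util.List[String] = _
            def getItems(): java.util.List[String] = items
            def setItems(value: java.util.List[String]): Unit = { items = value }
        }

        // The erased return type carries no element type:
        // prints "interface java.util.List"
        println(classOf[Record].getMethod("getItems").getReturnType)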
   
   Currently, to build encoders for Avro objects using reflection, in the
   same way as for most common Scala objects, we need to modify
   ScalaReflection, as is done in this PR.
   
   
   
