mazeboard edited a comment on issue #24299: [SPARK-27388][SQL] expression encoder for objects defined by properties
URL: https://github.com/apache/spark/pull/24299#issuecomment-480845183

1. JavaBean does not support Avro fixed types, because a fixed type has a single property named `bytes`; JavaBean introspection only recognizes properties whose accessors are prefixed with get/set (see the first sketch below).
2. I believe the current implementation of `Encoders.bean` (`JavaTypeInference`) has a bug: line 136 of sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala should be
   ```scala
   val properties = getJavaBeanReadableAndWritableProperties(other)
   ```
   and not
   ```scala
   val properties = getJavaBeanReadableProperties(other)
   ```
3. For `ds: Dataset[(Foo, Bar)]`, `ds.map` recursively uses the expression encoder (`ScalaReflection`), so I get `encoder not found` for all embedded objects, even if I declare a bean encoder for `Foo` and `Bar` (see the second sketch below).
4. `Encoders.bean` fails for Java enums: an assertion fails because an enum is saved as a `String`, not a `StructType` (see the third sketch below).

Point 3 above shows that the addition must indeed be in `ScalaReflection`. For a type `Zoo` with two fields of types `Foo` and `Bar`:

```scala
implicit val exprEnc = ExpressionEncoder[Zoo]()
val r = List(makeZoo).toDS()
val ds: Dataset[(Foo, Bar)] = r.map(z => (z.getFoo, z.getBar))
```

In this example the addition in this PR works correctly, but with a bean encoder it does not: `map` recursively uses `ScalaReflection` to find encoders for the tuple element types `Foo` and `Bar`, even if we declare bean encoders for `Zoo`, `Foo` and `Bar`.
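To make point 1 concrete, here is a minimal sketch; `FixedLike` is a hypothetical stand-in for an Avro generated fixed type, whose reader/writer pair is `bytes()`/`bytes(byte[])` with no get/set prefix, so JavaBean introspection finds no property to expose:

```scala
import java.beans.Introspector

// Hypothetical stand-in for an Avro generated fixed type: the accessor
// pair is bytes()/bytes(byte[]), with no get/set prefix.
class FixedLike {
  private var value: Array[Byte] = Array.empty
  def bytes(): Array[Byte] = value                // reader, but not getBytes()
  def bytes(v: Array[Byte]): Unit = { value = v } // writer, but not setBytes()
}

object FixedLikeDemo extends App {
  val descriptors = Introspector.getBeanInfo(classOf[FixedLike]).getPropertyDescriptors
  // Only the implicit "class" property (from Object.getClass) is reported;
  // "bytes" is invisible to JavaBean introspection, hence to Encoders.bean.
  descriptors.map(_.getName).foreach(println)
}
```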
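The failure in point 3 can be reproduced along these lines. `Zoo`, `Foo` and `Bar` are the types from the comment, but their field definitions here are made up for the sketch; the explicit `Encoders.tuple` call at the end is a workaround that avoids the implicit (ScalaReflection) lookup entirely:

```scala
import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession}

// Hypothetical bean-style classes standing in for Zoo/Foo/Bar in the comment.
class Foo extends Serializable { private var n = 0; def getN: Int = n; def setN(v: Int): Unit = { n = v } }
class Bar extends Serializable { private var s = ""; def getS: String = s; def setS(v: String): Unit = { s = v } }
class Zoo extends Serializable {
  private var foo = new Foo; private var bar = new Bar
  def getFoo: Foo = foo; def setFoo(f: Foo): Unit = { foo = f }
  def getBar: Bar = bar; def setBar(b: Bar): Unit = { bar = b }
}

object ZooDemo extends App {
  val spark = SparkSession.builder().master("local[*]").appName("zoo").getOrCreate()
  import spark.implicits._

  implicit val zooEnc: Encoder[Zoo] = Encoders.bean(classOf[Zoo])
  val r: Dataset[Zoo] = List(new Zoo).toDS()

  // With only the bean encoder in scope, the map below would ask the implicit
  // machinery (ScalaReflection) for an Encoder[(Foo, Bar)], which recursively
  // needs encoders for Foo and Bar and fails with "encoder not found".
  // Passing a tuple of bean encoders explicitly sidesteps that lookup:
  val pairEnc: Encoder[(Foo, Bar)] =
    Encoders.tuple(Encoders.bean(classOf[Foo]), Encoders.bean(classOf[Bar]))
  val ds: Dataset[(Foo, Bar)] = r.map(z => (z.getFoo, z.getBar))(pairEnc)

  ds.show()
  spark.stop()
}
```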
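Point 4 can be exercised with an existing Java enum; `TimeUnit` here is just a convenient stand-in for the enum-typed property the comment describes, and the assertion failure is what the comment reports, not something verified in this sketch:

```scala
import java.util.concurrent.TimeUnit
import org.apache.spark.sql.Encoders

// Hypothetical bean with a Java enum property (TimeUnit stands in for any
// user-defined enum), matching the shape described in point 4.
class WithEnum extends Serializable {
  private var unit: TimeUnit = TimeUnit.SECONDS
  def getUnit: TimeUnit = unit
  def setUnit(u: TimeUnit): Unit = { unit = u }
}

object EnumDemo extends App {
  // Per point 4 of the comment, this call trips an assertion on the affected
  // versions, because the enum field is inferred as a String, not a StructType.
  val enc = Encoders.bean(classOf[WithEnum])
  println(enc.schema)
}
```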
