mosche commented on code in PR #22157:
URL: https://github.com/apache/beam/pull/22157#discussion_r923244274
##########
runners/spark/2/src/main/java/org/apache/beam/runners/spark/structuredstreaming/translation/helpers/EncoderFactory.java:
##########
@@ -17,38 +17,35 @@
*/
package org.apache.beam.runners.spark.structuredstreaming.translation.helpers;
-import static org.apache.spark.sql.types.DataTypes.BinaryType;
-
-import java.util.Collections;
-import java.util.List;
-import org.apache.beam.sdk.coders.Coder;
import org.apache.spark.sql.Encoder;
-import org.apache.spark.sql.catalyst.analysis.GetColumnByOrdinal;
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder;
-import org.apache.spark.sql.catalyst.expressions.BoundReference;
-import org.apache.spark.sql.catalyst.expressions.Cast;
import org.apache.spark.sql.catalyst.expressions.Expression;
-import org.apache.spark.sql.types.ObjectType;
-import scala.collection.JavaConversions;
-import scala.reflect.ClassTag;
+import org.apache.spark.sql.catalyst.expressions.objects.StaticInvoke;
+import org.apache.spark.sql.types.DataType;
+import scala.collection.Seq;
+import scala.collection.immutable.List;
+import scala.collection.immutable.Nil$;
+import scala.collection.mutable.WrappedArray;
import scala.reflect.ClassTag$;
public class EncoderFactory {
- public static <T> Encoder<T> fromBeamCoder(Coder<T> coder) {
- Class<? super T> clazz = coder.getEncodedTypeDescriptor().getRawType();
- ClassTag<T> classTag = ClassTag$.MODULE$.apply(clazz);
- List<Expression> serializers =
- Collections.singletonList(
- new EncoderHelpers.EncodeUsingBeamCoder<>(
- new BoundReference(0, new ObjectType(clazz), true), coder));
-
+ static <T> Encoder<T> create(
+ Expression serializer, Expression deserializer, Class<? super T> clazz) {
+ List<Expression> serializers = Nil$.MODULE$.$colon$colon(serializer);
Review Comment:
I agree that interfacing with Scala from Java is ugly and painful. BUT that's
also the nature of working with Spark at a lower level than what the Java API
offers, which is needed to create an efficient runner :(
What I'd suggest is to create a single static utility class, e.g.
`ScalaInterop` (or `ScalaInteroperability`), to deal with these conversions.
But I'd rather not do it in this PR, as I'm running into more and more
conflicts with the pending PR for the structured streaming runner.
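For illustration, here is a minimal sketch of what such a utility might look like. The class name `ScalaInterop` comes from the suggestion above; the method name `listOf` is hypothetical, and the code is not part of this PR. It assumes scala-library is on the classpath and just wraps the `Nil$.MODULE$.$colon$colon(...)` call from the diff:

```java
import scala.collection.immutable.List;
import scala.collection.immutable.Nil$;

/** Hypothetical helper that keeps Scala interop noise in one place. */
final class ScalaInterop {
  private ScalaInterop() {}

  /**
   * Builds a single-element immutable Scala List by prepending the element
   * to Nil, i.e. the Java spelling of Scala's `element :: Nil`.
   */
  static <T> List<T> listOf(T element) {
    return Nil$.MODULE$.$colon$colon(element);
  }
}
```

With something like this, the line in `EncoderFactory.create` could read `List<Expression> serializers = ScalaInterop.listOf(serializer);`, keeping the `$colon$colon` mangling out of the translation code.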