Will you know the concrete Avro record type at the time you write the code?
I’ve been able to do something along the lines of
public class GenericAvroFunction<T extends SpecificRecordBase> extends DoFn<T, String> {
    …
    public void process(T input, Emitter<String> emitter) {
        …
    }
}
and then parameterizing it in each pipeline that uses it. I'm not sure how to
make it work when the record type is only known at run time, though.
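To make the pattern concrete, here is a minimal, self-contained sketch of the same idea. It does not use the real Crunch or Avro classes: Record, Emitter, DoFn, ToCsvFn, User and transform are all hypothetical stand-ins invented for illustration so that the snippet compiles on its own. The point it tries to show is that a named type parameter (<T extends ...>) on the class and on the calling method compiles, where a wildcard (? extends ...) in the anonymous DoFn does not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class GenericAvroSketch {

    // Hypothetical, simplified stand-in for an Avro record (NOT the real SpecificRecord).
    interface Record {
        Object get(int field);
        int fieldCount();
    }

    // Hypothetical stand-in for Crunch's Emitter<T>.
    interface Emitter<T> {
        void emit(T value);
    }

    // Hypothetical stand-in for Crunch's DoFn<S, T>.
    abstract static class DoFn<S, T> {
        abstract void process(S input, Emitter<T> emitter);
    }

    // Generic function: T is a named type parameter, bound where the function is used.
    static class ToCsvFn<T extends Record> extends DoFn<T, String> {
        @Override
        void process(T input, Emitter<String> emitter) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < input.fieldCount(); i++) {
                if (i > 0) {
                    sb.append(',');
                }
                sb.append(input.get(i));
            }
            emitter.emit(sb.toString());
        }
    }

    // Generic method: a type parameter instead of "? extends Record",
    // so the DoFn can be instantiated with a concrete type argument.
    static <T extends Record> List<String> transform(List<T> coll) {
        DoFn<T, String> fn = new ToCsvFn<>();
        List<String> out = new ArrayList<>();
        for (T record : coll) {
            fn.process(record, out::add);
        }
        return out;
    }

    // Hypothetical concrete record, standing in for a generated Avro class.
    static final class User implements Record {
        private final String name;
        private final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public Object get(int field) {
            return field == 0 ? name : (Object) age;
        }

        @Override
        public int fieldCount() {
            return 2;
        }
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(new User("ann", 30), new User("bob", 41));
        // T is inferred as User at this call site.
        System.out.println(transform(users));
    }
}
```

Each pipeline then picks its own record type at the call site, which is the "parameterizing it in the various pipelines" part. This only covers the compile-time case; it says nothing about how the real library behaves when the type is unknown until run time.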
From: Sankash Shankar [mailto:[email protected]]
Sent: Monday, June 22, 2015 4:18 PM
To: [email protected]
Subject: How to write a generic transform method that will act upon generated
avro objects in a generic fashion
Hello.
I am writing a Crunch job that takes in an arbitrary class that extends
SpecificRecord and performs a transformation on the fields in the class. I am
attempting to write a parallelDo function on these classes, but
public static PCollection<String> function(PCollection<? extends SpecificRecord> coll) {
    coll.parallelDo(new DoFn<? extends SpecificRecord, String>() {
        ...
    }, Avros.strings());
}
will not compile given it expects a type at compile time, while
public static PCollection<String> transformAvroToCsv(PCollection<SpecificRecord> coll) {
    coll.parallelDo(new DoFn<SpecificRecord, String>() {
        @Override
        public void process(SpecificRecord input, Emitter<String> emitter) {
        }
    }, Avros.strings());
    return null;
}
will fail at run time because SpecificRecord is an interface and has no init
constructor. What is the standard way to accept arbitrary Avro records and
apply a generic transform method to them?
Thanks.