Hey Sankash, I don't understand a couple of things here:

1) The init() error in SpecificRecord from your original email: I could see that sort of thing being a problem if you were trying to create a PType<SpecificRecord> vs. a PType<SomeImplOfSpecificRecord>, but I don't get why it would be a problem in defining an ordinary DoFn.

2) Why David's suggestion of GenericAvroFunction<T extends SpecificRecordBase> wouldn't be serializable.

J

On Mon, Jun 22, 2015 at 3:15 PM, David Ortiz <[email protected]> wrote:

> How are you getting it into a PCollection? Whatever you're doing there
> should work for the function shouldn't it?
>
> On Jun 22, 2015 6:09 PM, Sankash Shankar <[email protected]> wrote:
>
> Hello,
>
> With regards to your question, we will know the class will be one of a
> pre-defined list of classes, but the exact class will not be known until
> runtime. In addition, the generic class GenericAvroFunction cannot be
> defined in a static manner and a generic type, which keeps it from being
> serializable.
>
> Thanks.
>
> On Mon, Jun 22, 2015 at 1:23 PM, David Ortiz <[email protected]>
> wrote:
>
>> When you actually write the code will you know what the avro record
>> is? I’ve been able to do something along the lines of
>>
>> public class GenericAvroFunction<T extends SpecificRecordBase> extends
>> DoFn<T, String> {
>>
>>   …
>>
>>   public void process(T input, Emitter<String> emitter) {
>>
>>     …
>>
>>   }
>>
>> }
>>
>> then parameterizing it in the various pipelines that use it. Not sure
>> with regards to making it work at run time though.
>>
>> *From:* Sankash Shankar [mailto:[email protected]]
>> *Sent:* Monday, June 22, 2015 4:18 PM
>> *To:* [email protected]
>> *Subject:* How to write a generic transform method that will act upon
>> generated avro objects in a generic fashion
>>
>> Hello.
>>
>> I am writing a Crunch job that takes in an arbitrary class that extends
>> SpecificRecord and performs a transformation on the fields in the class. I
>> am attempting to write a parallelDo function on these classes, but
>>
>> public static PCollection<String> function(PCollection<? extends
>> SpecificRecord> coll) {
>>   coll.parallelDo(new DoFn<? extends SpecificRecord, String>() {
>>     ...
>>   }, Avros.strings());
>> }
>>
>> will not compile given it expects a type at compile time, while
>>
>> public static PCollection<String>
>> transformAvroToCsv(PCollection<SpecificRecord> coll) {
>>   coll.parallelDo(new DoFn<SpecificRecord, String>() {
>>     @Override
>>     public void process(SpecificRecord input, Emitter<String> emitter) {
>>     }
>>   }, Avros.strings());
>>   return null;
>> }
>>
>> will fail at run-time due to SpecificRecord not having an init
>> constructor.
>>
>> What is the standard way for taking in generic avro records and having
>> a generic transform method to call on them?
>>
>> Thanks.
>>
>> *This email is intended only for the use of the individual(s) to
>> whom it is addressed. If you have received this communication in error,
>> please immediately notify the sender and delete the original email.*
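
A note on the first snippet quoted above, the one that does not compile: Java rejects a wildcard as a type argument of the class being instantiated, so new DoFn<? extends SpecificRecord, String>() cannot work, whereas a type parameter declared on the method itself can be used there. Below is a minimal sketch of that pattern. It deliberately uses neither Crunch nor Avro: Record, Fn, User, and toCsv are hypothetical stand-ins (for SpecificRecord, DoFn, a generated record class, and the transform method) so that the example compiles on its own. It speaks only to the compile-time error, not to the run-time init error or the serialization question.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Stand-in types only: these are NOT the Crunch or Avro classes, just
// minimal shapes so the generics can be compiled without either library.
class GenericTransformSketch {

    /** Hypothetical stand-in for Avro's SpecificRecord (index-based field access). */
    interface Record {
        Object get(int field);
        int fieldCount();
    }

    /** Hypothetical stand-in for Crunch's DoFn<S, T>. */
    abstract static class Fn<S, T> implements Serializable {
        abstract void process(S input, Consumer<T> emitter);
    }

    /** One concrete record type, standing in for a generated Avro class. */
    static final class User implements Record {
        private final String name;
        private final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public Object get(int field) {
            return field == 0 ? name : (Object) age;
        }

        @Override
        public int fieldCount() {
            return 2;
        }
    }

    /**
     * T is a named type parameter on the method, so "new Fn<T, String>()" is
     * legal here, while "new Fn<? extends Record, String>()" would be rejected.
     */
    static <T extends Record> List<String> toCsv(List<T> records) {
        Fn<T, String> fn = new Fn<T, String>() {
            @Override
            void process(T input, Consumer<String> emitter) {
                // Join the record's fields with commas and emit one line.
                StringBuilder line = new StringBuilder();
                for (int i = 0; i < input.fieldCount(); i++) {
                    if (i > 0) {
                        line.append(',');
                    }
                    line.append(input.get(i));
                }
                emitter.accept(line.toString());
            }
        };
        List<String> out = new ArrayList<>();
        for (T record : records) {
            fn.process(record, out::add);
        }
        return out;
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(new User("ann", 30), new User("bob", 41));
        System.out.println(toCsv(users));
    }
}
```

The caller passes a list of a concrete record type (List<User> here) and the compiler infers T, so the same method serves every record class without naming any of them in its body. Whether the equivalent Crunch method also behaves at run time depends on how the PCollection's PType was built, which this sketch does not model.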
