Hey Sankash,

I don't understand a couple of things here:

1) The init() error in SpecificRecord from your original email: I could see
that sort of thing being a problem if you were trying to create a
PType<SpecificRecord> vs. a PType<SomeImplOfSpecificRecord>, but I don't
get why it would be a problem in defining an ordinary DoFn.
2) Why David's suggestion of GenericAvroFunction<T extends
SpecificRecordBase> wouldn't be serializable.
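
On the first point, here is a minimal plain-Java sketch of where an init()-style error can come from. It assumes the failure is caused by something instantiating the record class reflectively, which is a guess about the cause and not something confirmed in this thread. BaseRecord and UserRecord are hypothetical stand-ins for the SpecificRecord interface and a generated class; no Avro or Crunch types are used.

```java
import java.lang.reflect.Constructor;

// Stand-in names only: BaseRecord plays the role of the SpecificRecord
// interface, UserRecord the role of a generated class. Neither is an Avro type.
class InitDemo {
    interface BaseRecord {}

    static class UserRecord implements BaseRecord {
        public UserRecord() {}
    }

    // Returns true if the type can be created reflectively with no arguments.
    static boolean hasNoArgConstructor(Class<?> type) {
        try {
            Constructor<?> ctor = type.getDeclaredConstructor();
            ctor.newInstance();
            return true;
        } catch (ReflectiveOperationException e) {
            // For an interface, getDeclaredConstructor() throws
            // NoSuchMethodException, whose message names <init>().
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("interface: " + hasNoArgConstructor(BaseRecord.class));
        System.out.println("concrete:  " + hasNoArgConstructor(UserRecord.class));
    }
}
```

An interface has no constructor to look up, so the first check fails while the concrete class passes. If that is what is happening here, the fix would be on the side that describes the record type, not in the DoFn body.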

J

On Mon, Jun 22, 2015 at 3:15 PM, David Ortiz <[email protected]>
wrote:

>  How are you getting it into a PCollection?  Whatever you're doing there
> should work for the function, shouldn't it?
>
>  On Jun 22, 2015 6:09 PM, Sankash Shankar <[email protected]> wrote:
>  Hello,
>
>  With regards to your question, we will know the class will be one of a
> pre-defined list of classes, but the exact class will not be known until
> runtime. In addition, the generic class GenericAvroFunction cannot be
> defined as both a static class and a generic type, which keeps it from
> being serializable.
>
>  Thanks.
>
>
>
> On Mon, Jun 22, 2015 at 1:23 PM, David Ortiz <[email protected]>
> wrote:
>
>>  When you actually write the code will you know what the avro record
>> is?  I’ve been able to do something along the lines of
>>
>>
>>
>> public class GenericAvroFunction<T extends SpecificRecordBase> extends DoFn<T, String> {
>>   …
>>
>>   @Override
>>   public void process(T input, Emitter<String> emitter) {
>>     …
>>   }
>> }
>>
>>
>>
>> then parameterizing it in the various pipelines that use it.  Not sure
>> how to make it work at run time, though.
>>
>>
>>
>> *From:* Sankash Shankar [mailto:[email protected]]
>> *Sent:* Monday, June 22, 2015 4:18 PM
>> *To:* [email protected]
>> *Subject:* How to write a generic transform method that will act upon
>> generated avro objects in a generic fashion
>>
>>
>>
>> Hello.
>>
>>
>>
>> I am writing a Crunch job that takes in an arbitrary class that extends
>> SpecificRecord and performs a transformation on the fields in the class. I
>> am attempting to write a parallelDo function on these classes, but
>>
>> public static PCollection<String> function(PCollection<? extends SpecificRecord> coll) {
>>   coll.parallelDo(new DoFn<? extends SpecificRecord, String>() {
>>     ...
>>   }, Avros.strings());
>> }
>>
>> will not compile given it expects a type at compile time, while
>>
>> public static PCollection<String> transformAvroToCsv(PCollection<SpecificRecord> coll) {
>>   coll.parallelDo(new DoFn<SpecificRecord, String>() {
>>     @Override
>>     public void process(SpecificRecord input, Emitter<String> emitter) {
>>     }
>>   }, Avros.strings());
>>   return null;
>> }
>>
>> will fail at run time due to SpecificRecord not having an init
>> constructor.
>>
>> What is the standard way to take in generic Avro records and call a
>> generic transform method on them?
>>
>>
>>
>> Thanks.
>>     *This email is intended only for the use of the individual(s) to
>> whom it is addressed. If you have received this communication in error,
>> please immediately notify the sender and delete the original email.*
>>
>
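
For reference, the two Java questions in this thread (why the wildcard DoFn does not compile, and whether a generic function class can be serialized) can be checked in plain Java without Crunch or Avro. The sketch below is only an illustration under stated assumptions: RecordBase, User, Fn, ToCsv, transform, toCsv and roundTrip are hypothetical stand-ins (RecordBase for SpecificRecordBase, Fn for a serializable function type such as DoFn), not Crunch or Avro APIs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Stand-ins only: RecordBase ~ SpecificRecordBase, Fn ~ DoFn.
class GenericFnDemo {
    static abstract class RecordBase {
        abstract List<Object> fields();
    }

    static class User extends RecordBase {
        final String name;
        final int age;
        User(String name, int age) { this.name = name; this.age = age; }
        @Override
        List<Object> fields() { return Arrays.asList(name, age); }
    }

    interface Fn<S, T> extends Serializable {
        T apply(S input);
    }

    // A named, static, generic class. The type parameter is erased at run
    // time, so it plays no part in Java serialization.
    static class ToCsv<T extends RecordBase> implements Fn<T, String> {
        private static final long serialVersionUID = 1L;
        @Override
        public String apply(T input) {
            StringJoiner row = new StringJoiner(",");
            for (Object field : input.fields()) {
                row.add(String.valueOf(field));
            }
            return row.toString();
        }
    }

    // A generic method gives the element type a name (T), which an
    // expression like "new Fn<? extends RecordBase, String>()" cannot do.
    static <T extends RecordBase> List<String> transform(List<T> coll, Fn<T, String> fn) {
        List<String> out = new ArrayList<>();
        for (T rec : coll) {
            out.add(fn.apply(rec));
        }
        return out;
    }

    static <T extends RecordBase> List<String> toCsv(List<T> coll) {
        return transform(coll, new ToCsv<T>());
    }

    // Serializes and deserializes a function object in memory.
    @SuppressWarnings("unchecked")
    static <F extends Serializable> F roundTrip(F fn) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(fn);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (F) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The caller only knows "some subclass of RecordBase"; wildcard
        // capture lets it call the generic method anyway.
        List<? extends RecordBase> records =
            Arrays.asList(new User("ann", 30), new User("bob", 25));
        System.out.println(toCsv(records));

        ToCsv<User> copy = roundTrip(new ToCsv<User>());
        System.out.println(copy.apply(new User("ann", 30)));
    }
}
```

The generic class survives a serialization round trip, and the caller never names the concrete record type at compile time. In Crunch itself a PType for the concrete record class would presumably still be needed at run time; that part is not covered by this sketch.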
