----- Forwarded Message -----
>From: Andrew Kenworthy <[email protected]>
>To: Gaurav Nanda <[email protected]> 
>Sent: Thursday, December 8, 2011 3:47 PM
>Subject: Re: Collecting union-ed Records in AvroReducer
> 
>
>Hallo Gaurav,
>
>
>Thank you for your reply. My problem is that the writer is implemented by 
>GenericDatumWriter, which is called via Hadoop; i.e., in my code I only have 
>direct access to an AvroCollector object, which - several layers later - 
>invokes a GenericDatumWriter. I don't really want to have to re-implement a lot 
>of code that the avro-mapred package provides for me.
>
>
>But I think I can get around this by defining my output schema as being one 
>with a nested record structure, and "embed" my type B record within the type 
>"A". That way I am emitting a single record, albeit holding a composition of 
>my two entities.
>
>
>Andrew
>
>
>
>>________________________________
>> From: Gaurav Nanda <[email protected]>
>>To: [email protected]; Andrew Kenworthy <[email protected]> 
>>Sent: Thursday, December 8, 2011 3:32 PM
>>Subject: Re: Collecting union-ed Records in AvroReducer
>> 
>>You don't need to construct a record object. You can just write your
>>RecordA/RecordB objects directly.
>>
>>Sample Writer:
>>        DatumWriter<Object> datum = new GenericDatumWriter<Object>(schema);
>>        DataFileWriter<Object> writer = new DataFileWriter<Object>(datum);
>>
>>        FileOutputStream out = new FileOutputStream("h:\\TestFile.avro");
>>
>>        writer.create(schema, out);
>>        writer.append(1050324); // You can write your recordA/recordB here.
>>
>>        writer.close();
>>
>>Sample Reader:
>>
>>        File out = new File("h:\\TestFile.avro");
>>        GenericDatumReader<Object> datum = new GenericDatumReader<Object>();
>>        DataFileReader<Object> reader = new DataFileReader<Object>(out, datum);
>>
>>        while (reader.hasNext()) {
>>            System.out.println(reader.next());
>>        }
>>        reader.close();
>>
>>Hope this helps.
>>
>>Thanks,
>>Gaurav Nanda
>>
>>On Thu, Dec 8, 2011 at 5:40 PM, Andrew Kenworthy <[email protected]> 
>>wrote:
>>> Hallo,
>>>
>>> is it possible to write/collect a union-ed record from an avro reducer?
>>>
>>> I have a reduce class (extending AvroReducer), and the output schema is a
>>> union schema of record type A and record type B. In the reduce logic I want
>>> to combine instances of A and B in the same datum, passing it to my
>>> AvroCollector. My code looks a bit like this:
>>>
>>> Record unionRecord = new GenericData.Record(myUnionSchema); // not legal!
>>> unionRecord.put("type A", recordA);
>>> unionRecord.put("type B", recordB);
>>> collector.collect(unionRecord);
>>>
>>> but GenericData.Record constructor expects a Record Schema. How can I write
>>> both records such that they appear in the same output datum?
>>>
>>> Andrew
>>
>>
>>
>
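Gaurav's suggestion of writing the RecordA/RecordB objects directly against
the union schema could look roughly like the sketch below. It is only an
illustration: the two record schemas (A with an int field "id", B with a
string field "label"), the class name UnionWriteSketch and the in-memory
stream are assumptions, not taken from the thread. Each record is appended
as its own datum; the union schema simply allows either type at the top level.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class UnionWriteSketch {
    // Hypothetical union schema: either a record A or a record B.
    static final String UNION_JSON =
        "[{\"type\":\"record\",\"name\":\"A\",\"fields\":"
      + "[{\"name\":\"id\",\"type\":\"int\"}]},"
      + "{\"type\":\"record\",\"name\":\"B\",\"fields\":"
      + "[{\"name\":\"label\",\"type\":\"string\"}]}]";

    // Writes one A and one B as separate datums, reads them back,
    // and returns the schema name of each datum in file order.
    static List<String> roundTrip() throws IOException {
        Schema union = new Schema.Parser().parse(UNION_JSON);
        Schema schemaA = union.getTypes().get(0);
        Schema schemaB = union.getTypes().get(1);

        GenericRecord recordA = new GenericData.Record(schemaA);
        recordA.put("id", 1);
        GenericRecord recordB = new GenericData.Record(schemaB);
        recordB.put("label", "example");

        // The writer is created with the union schema, so both record
        // types are accepted by append().
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataFileWriter<Object> writer =
            new DataFileWriter<Object>(new GenericDatumWriter<Object>(union));
        writer.create(union, bytes);
        writer.append(recordA);
        writer.append(recordB);
        writer.close();

        List<String> names = new ArrayList<String>();
        DataFileStream<Object> reader = new DataFileStream<Object>(
            new ByteArrayInputStream(bytes.toByteArray()),
            new GenericDatumReader<Object>());
        while (reader.hasNext()) {
            GenericRecord datum = (GenericRecord) reader.next();
            names.add(datum.getSchema().getName());
        }
        reader.close();
        return names;
    }
}
```

Note that this yields two output datums, one per record, which is why it
does not by itself put A and B into the same datum as Andrew wanted.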
>
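Andrew's workaround of embedding the type B record inside type A might be
sketched as follows. The schema, the field names "id", "b" and "label", and
the class name NestedRecordSketch are hypothetical and chosen only for
illustration; the real schemas are not shown in the thread. Because the
outer schema is a record schema, GenericData.Record accepts it, which is
exactly what the union schema did not allow.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

public class NestedRecordSketch {
    // Hypothetical output schema: record A with a field "b" that
    // holds an embedded record of type B.
    static final String SCHEMA_JSON =
        "{\"type\":\"record\",\"name\":\"A\",\"fields\":["
      + "{\"name\":\"id\",\"type\":\"int\"},"
      + "{\"name\":\"b\",\"type\":{\"type\":\"record\",\"name\":\"B\","
      + "\"fields\":[{\"name\":\"label\",\"type\":\"string\"}]}}]}";

    // Builds a single composite datum holding both entities.
    static GenericRecord buildComposite(int id, String label) {
        Schema schemaA = new Schema.Parser().parse(SCHEMA_JSON);
        // The nested record schema is taken from the field definition.
        Schema schemaB = schemaA.getField("b").schema();

        GenericRecord recordB = new GenericData.Record(schemaB);
        recordB.put("label", label);

        GenericRecord recordA = new GenericData.Record(schemaA);
        recordA.put("id", id);
        recordA.put("b", recordB);
        return recordA;
    }

    public static void main(String[] args) {
        // In a reducer, this composite would be the single datum
        // passed to collector.collect(...).
        System.out.println(buildComposite(1, "example"));
    }
}
```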
