Hi Peter,

I think what you need is a union <https://avro.apache.org/docs/1.8.1/spec.html#Unions> of records. What comes to mind is a record type with these fields: all the common fields (commonField1, commonField2) plus one additional union field for the derived types. The union should not be nullable, since your base class is abstract. Its branches are your concrete records, RecordTypeA and RecordTypeB, each holding only the fields specific to that derived type.
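A rough sketch of such a schema, only to illustrate the idea. The namespace, the union field name "specific", the primitive types chosen for each field, and the second field name in each concrete record (specificToA2, specificToB2) are my assumptions: your example lists the same field name twice, and Avro requires field names to be unique within a record. The fields elided with "..." in your classes are left out.

```json
{
  "type": "record",
  "name": "RecordWithCommonFields",
  "namespace": "com.example.legacy",
  "fields": [
    {"name": "commonField1", "type": "long"},
    {"name": "commonField2", "type": "string"},
    {
      "name": "specific",
      "type": [
        {
          "type": "record",
          "name": "RecordTypeA",
          "fields": [
            {"name": "specificToA1", "type": "int"},
            {"name": "specificToA2", "type": "string"}
          ]
        },
        {
          "type": "record",
          "name": "RecordTypeB",
          "fields": [
            {"name": "specificToB1", "type": "boolean"},
            {"name": "specificToB2", "type": "string"}
          ]
        }
      ]
    }
  ]
}
```

With a layout like this, the union branch chosen for each record plays the role of your discriminator, so a separate discriminator field is probably not needed. If any of the common fields can be missing in the legacy data, their types would have to become unions with "null".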
Regards,
Nandor

On Wed, Jun 20, 2018 at 3:35 PM, Horváth Péter Gergely <[email protected]> wrote:

> Hi All,
>
> We have some legacy file format, which I would need to migrate to Avro
> format. The tricky part is that the records basically have
>
>    - some common fields,
>    - a discriminator field and
>    - some unique fields, specific to the type selected by the
>      discriminator field
>
> all of them is stored in the same file, without any order, mixed with each
> other.
>
> In Java/object-oriented programming, one could represent our records
> concept as the following:
>
> abstract class RecordWithCommonFields {
>     private Long commonField1;
>     private String commonField2;
>     ...
> }
>
> class RecordTypeA extends RecordWithCommonFields {
>     private Integer specificToA1;
>     private String specificToA1;
>     ...
> }
>
> class RecordTypeB extends RecordWithCommonFields {
>     private Boolean specificToB1;
>     private String specificToB1;
>     ...
> }
>
> Imagine the data being something like this:
>
> commonField1Value;commonField2Value,TYPE_IS_A,specificToA1Value,
> specificToA1Value
> commonField1Value;commonField2Value,TYPE_IS_B,specificToB1Value,
> specificToB1Value
>
> So I would like to process an incoming file and write its content to Avro
> format, somehow representing the different types of the records:
> technically this would be an array, which should hold different types of
> records.
>
> Can someone give me some ideas on how to achieve this?
>
> Thanks,
> Peter
