Hi All,

We have a legacy file format that I need to migrate to Avro.
The tricky part is that the records have

   - some common fields,
   - a discriminator field and
   - some unique fields, specific to the type selected by the discriminator
   field

Records of all types are stored in the same file, in no particular order,
interleaved with each other.

In Java (or any object-oriented language), our record model could be
represented as follows:

abstract class RecordWithCommonFields {
   private Long commonField1;
   private String commonField2;
   ...
}

class RecordTypeA extends RecordWithCommonFields {
   private Integer specificToA1;
   private String specificToA2;
   ...
}

class RecordTypeB extends RecordWithCommonFields {
   private Boolean specificToB1;
   private String specificToB2;
   ...
}

Imagine the data being something like this:

commonField1Value;commonField2Value,TYPE_IS_A,specificToA1Value,specificToA2Value
commonField1Value;commonField2Value,TYPE_IS_B,specificToB1Value,specificToB2Value
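
To make the structure concrete, here is a minimal Java sketch of how such
lines could be dispatched on the discriminator. It is only an illustration
under assumptions: the `LegacyLineParser` class, the constructors, and the
`specificToA2`/`specificToB2` field names are hypothetical, both `;` and `,`
are treated as separators, and every line is assumed to have exactly five
fields with a numeric first field.

```java
import java.util.ArrayList;
import java.util.List;

// Common part shared by every record type.
abstract class RecordWithCommonFields {
    final Long commonField1;
    final String commonField2;

    RecordWithCommonFields(Long commonField1, String commonField2) {
        this.commonField1 = commonField1;
        this.commonField2 = commonField2;
    }
}

class RecordTypeA extends RecordWithCommonFields {
    final Integer specificToA1;
    final String specificToA2; // hypothetical name for the second A-specific field

    RecordTypeA(Long c1, String c2, Integer a1, String a2) {
        super(c1, c2);
        this.specificToA1 = a1;
        this.specificToA2 = a2;
    }
}

class RecordTypeB extends RecordWithCommonFields {
    final Boolean specificToB1;
    final String specificToB2; // hypothetical name for the second B-specific field

    RecordTypeB(Long c1, String c2, Boolean b1, String b2) {
        super(c1, c2);
        this.specificToB1 = b1;
        this.specificToB2 = b2;
    }
}

// Hypothetical helper: turns one legacy line into the matching subclass.
class LegacyLineParser {
    static RecordWithCommonFields parseLine(String line) {
        // Assumption: ';' and ',' are both field separators.
        String[] f = line.split("[;,]", -1);
        if (f.length != 5) {
            throw new IllegalArgumentException("Expected 5 fields: " + line);
        }
        Long c1 = Long.valueOf(f[0]);
        String c2 = f[1];
        // f[2] is the discriminator that selects the record type.
        switch (f[2]) {
            case "TYPE_IS_A":
                return new RecordTypeA(c1, c2, Integer.valueOf(f[3]), f[4]);
            case "TYPE_IS_B":
                return new RecordTypeB(c1, c2, Boolean.valueOf(f[3]), f[4]);
            default:
                throw new IllegalArgumentException("Unknown record type: " + f[2]);
        }
    }

    // Parses all lines, keeping the original (mixed) order.
    static List<RecordWithCommonFields> parseAll(List<String> lines) {
        List<RecordWithCommonFields> records = new ArrayList<>();
        for (String line : lines) {
            records.add(parseLine(line));
        }
        return records;
    }
}
```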

So I would like to process an incoming file and write its content in Avro
format, preserving the different record types: technically, the result
would be an array holding records of different types.
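
One possible direction, as a rough and unverified sketch: Avro has a union
type, and a data file can use a union of record schemas as its top-level
schema, so each appended record carries its own type. The code below assumes
the Apache Avro Java library is on the classpath. The `UnionAvroSketch` class,
the sample values, and the `specificToA2`/`specificToB2` field names are
hypothetical; the field types are taken from the classes above. The
discriminator is not stored as a field here, since the record schema name
plays that role.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.generic.GenericRecordBuilder;

class UnionAvroSketch {
    // One record schema per legacy type; the common fields are repeated in each.
    static final Schema TYPE_A = SchemaBuilder.record("RecordTypeA").fields()
            .requiredLong("commonField1")
            .requiredString("commonField2")
            .requiredInt("specificToA1")
            .requiredString("specificToA2")
            .endRecord();

    static final Schema TYPE_B = SchemaBuilder.record("RecordTypeB").fields()
            .requiredLong("commonField1")
            .requiredString("commonField2")
            .requiredBoolean("specificToB1")
            .requiredString("specificToB2")
            .endRecord();

    // Top-level schema of the file: every entry is either a RecordTypeA or a RecordTypeB.
    static final Schema ANY_RECORD = Schema.createUnion(Arrays.asList(TYPE_A, TYPE_B));

    // Writes one record of each type into the same Avro data file.
    static void writeSample(File out) throws IOException {
        GenericRecord a = new GenericRecordBuilder(TYPE_A)
                .set("commonField1", 1L)
                .set("commonField2", "common")
                .set("specificToA1", 7)
                .set("specificToA2", "a-only")
                .build();
        GenericRecord b = new GenericRecordBuilder(TYPE_B)
                .set("commonField1", 2L)
                .set("commonField2", "common")
                .set("specificToB1", true)
                .set("specificToB2", "b-only")
                .build();
        try (DataFileWriter<GenericRecord> writer =
                new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(ANY_RECORD))) {
            writer.create(ANY_RECORD, out);
            writer.append(a); // the writer picks the union branch from the record's schema
            writer.append(b);
        }
    }

    // Reads the file back and returns the schema name of each record, in file order.
    static List<String> readTypeNames(File in) throws IOException {
        List<String> names = new ArrayList<>();
        try (DataFileReader<GenericRecord> reader =
                new DataFileReader<>(in, new GenericDatumReader<GenericRecord>())) {
            for (GenericRecord r : reader) {
                names.add(r.getSchema().getName());
            }
        }
        return names;
    }
}
```

On the reading side, `getSchema().getName()` tells which of the two record
types each entry is, which would replace the legacy discriminator column.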

Can someone give me some ideas on how to achieve this?

Thanks,
Peter
