I’m pretty sure Avro only supports a single schema per file. You can create fields (columns) of record type and put each kind of record in the matching field, but at that point I would probably just use a map data type and write a custom record reader. Normally you’d split the data into a separate file for each schema, but I can understand situations where that’s not ideal. I’ve got several flows that put XML keys into a map column and then split them out in Hive later.
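As a rough sketch of the "one column per record type" idea, a single wrapper schema could hold each record type in its own nullable field, so that every line populates exactly one of them. The wrapper name OrderFileLine and the field names header, order, and trailer are hypothetical; the nested record definitions are taken from the schema in the question. This only illustrates the schema shape, and a reader would still need to decide which field each line belongs in.

```json
{
  "type": "record",
  "namespace": "com.acme",
  "name": "OrderFileLine",
  "doc": "Hypothetical wrapper: exactly one of header/order/trailer is non-null per line.",
  "fields": [
    {
      "name": "header",
      "default": null,
      "type": ["null", {
        "type": "record",
        "name": "HeaderRecord",
        "fields": [
          {"name": "PNSTORE", "type": "string"},
          {"name": "STORENAME", "type": "string"},
          {"name": "EXTRACTIONDATE", "type": "string"}
        ]
      }]
    },
    {
      "name": "order",
      "default": null,
      "type": ["null", {
        "type": "record",
        "name": "OrderRecord",
        "fields": [
          {"name": "SALESMAN", "type": "string"},
          {"name": "ORDER_NUMBER", "type": "string"},
          {"name": "DUE_DATE", "type": "string"},
          {"name": "ORDER_AMOUNT", "type": "long"}
        ]
      }]
    },
    {
      "name": "trailer",
      "default": null,
      "type": ["null", {
        "type": "record",
        "name": "TrailerRecord",
        "fields": [
          {"name": "TOTAL_RECORDS", "type": "long"},
          {"name": "TOTAL_AMOUNT", "type": "long"}
        ]
      }]
    }
  ]
}
```

Each field is a union with "null" listed first so that "default": null is valid, which lets a line leave the other two record fields empty.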
Thanks
Shawn

From: Eric Chaves <[email protected]>
Sent: Sunday, March 17, 2019 11:14 AM
To: [email protected]
Subject: Is it possible to use declare an Avro schema for multi-record files?

Hi folks,

Is it possible to declare an Avro schema for a ConvertRecord processor to handle a multi-record file, i.e. a file where each line may be a different Avro record? Something like this:

{
  "type" : "record",
  "namespace" : "com.acme",
  "name" : "OrderFile",
  "fields" : [
    {
      "type" : "record",
      "namespace" : "com.acme",
      "name" : "HeaderRecord",
      "fields" : [
        {"name":"PNSTORE", "type": "string"},
        {"name":"STORENAME", "type": "string"},
        {"name":"EXTRACTIONDATE", "type": "string"}
      ]
    },
    {
      "type" : "record",
      "namespace" : "com.acme",
      "name" : "OrderRecord",
      "fields" : [
        { "name": "SALESMAN", "type": "string" },
        { "name": "ORDER_NUMBER", "type": "string" },
        { "name": "DUE_DATE", "type": "string" },
        { "name": "ORDER_AMOUNT", "type": "long" }
      ]
    },
    {
      "type" : "record",
      "namespace" : "com.acme",
      "name" : "TrailerRecord",
      "fields" : [
        {"name":"TOTAL_RECORDS", "type": "long"},
        {"name":"TOTAL_AMOUNT", "type": "long"}
      ]
    }
  ]
}

Thanks in advance,
Eric
