Hi there,
I've just noticed that when I write out my binary data I don't appear to have a
schema saved with it. I was under the impression that Avro saves schemas along
with the data. Thanks for any clarification.
Here's my schema:
{
  "name": "FileDependency",
  "type": "record",
  "fields": [
    {"name": "file", "type": "string"},
    {"name": "imports", "type": {
      "type": "array", "items": "string"}
    }
  ]
}
The code to write out my data is as follows (I'd also appreciate any
suggestions for improvement, as I'm new to Avro):
@Cleanup
InputStream fileDependencySchemaIs = this.getClass()
        .getResourceAsStream(FILE_DEPENDENCY_GRAPH_SCHEMA_NAME);
Schema fileDependencySchema = Schema.parse(fileDependencySchemaIs);
GenericDatumWriter<GenericRecord> genericDatumWriter =
        new GenericDatumWriter<GenericRecord>(fileDependencySchema);
@Cleanup
OutputStream os = new FileOutputStream(new File(workFolder,
        FILE_DEPENDENCY_GRAPH_NAME));
Encoder encoder = new BinaryEncoder(os);
for (Map.Entry<String, Set<String>> entry : fileDependencies
        .entrySet()) {
    GenericRecord genericRecord = new GenericData.Record(
            fileDependencySchema);
    genericRecord.put("file", new Utf8(entry.getKey()));
    Set<String> imports = entry.getValue();
    GenericArray<Utf8> genericArray = new GenericData.Array<Utf8>(
            imports.size(),
            Schema.createArray(Schema.create(Type.STRING)));
    for (String importFile : imports) {
        genericArray.add(new Utf8(importFile));
    }
    genericRecord.put("imports", genericArray);
    genericDatumWriter.write(genericRecord, encoder);
}
encoder.flush();
Thanks again.
Kind regards,
Christopher