I recently extended a tool used at my work by adding an Avro output module.
The module works fine except that it appears to ignore default values in
the schema.
My program does something like the following, presuming schemaBuffer
contains:
{"type":"record",
"name":"test",
"fields":[
{"name":"foo", "type":"int"},
{"name":"hat", "type":"int", "default":12},
{"name":"bar", "type":"string"}
]
}
avro_schema_t schema;
avro_value_iface_t *iface;
avro_value_t writer_value, field;
avro_file_writer_t avro_writer;

avro_schema_from_json_length(schemaBuffer, schemaLen, &schema);
iface = avro_generic_class_from_schema(schema);
avro_generic_value_new(iface, &writer_value);
avro_file_writer_create_with_codec(outFilePath, schema, &avro_writer,
                                   "deflate", blockSize);
// iterate over some data structure containing src data {
    // fetch the target field from the record (lookup step assumed here;
    // fieldName is a placeholder for the current field's name)
    avro_value_get_by_name(&writer_value, fieldName, &field, NULL);
    // for int values {
        avro_value_set_int(&field, someIntValue);
    // }
    // similar code for other types...
// }
avro_file_writer_append_value(avro_writer, &writer_value);
// clean up
// flush etc. on program exit
The result is that the program correctly creates an Avro-encoded file with
one record for each of my input records, all with the correct values.
Except! The schema at the top of the output file is different to
the input schema. It now looks like:
{"type":"record",
"name":"test",
"fields":[
{"name":"foo", "type":"int"},
{"name":"hat", "type":"int"},
{"name":"bar", "type":"string"}
]
}
The default property just seems to be completely ignored by the schema
parser or otherwise not reproduced by the schema writer.
Having a look at the source code I came across this concerning struct in
schema.h:
struct avro_record_field_t {
int index;
char *name;
avro_schema_t type;
/*
* TODO: default values
*/
};
So it appears that default values are not supported by Avro C?
I'm pretty confused however as the documentation at
http://avro.apache.org/docs/1.7.7/api/c/index.html states:
The C implementation supports:
- binary encoding/decoding of all primitive and complex data types
- storage to an Avro Object Container File
- schema resolution, promotion and projection
- validating and non-validating mode for writing Avro data
The C implementation is lacking:
- RPC
Is the documentation wrong or am I just missing something?
I couldn't find any evidence that default values are supported after
reading over the source. Is this feature still planned to be implemented?
Should the documentation be updated to reflect that the C implementation
does not support default values?
This is a blocker for me so I was considering extending Avro C to support
default values myself but I thought I should check with the mailing list
first.
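To make the idea concrete, here is a rough, self-contained sketch of the
kind of change I have in mind: a field record that remembers whether a
default was given, and a writer that reproduces it. Everything in it
(struct sketch_field, has_default, default_int, sketch_field_to_json) is a
hypothetical stand-in I made up for illustration, not the real
avro_record_field_t or any existing Avro C function, and it only models
int defaults.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a record field; NOT the real avro_record_field_t. */
struct sketch_field {
    const char *name;
    const char *type;   /* primitive type name, e.g. "int" */
    int has_default;    /* nonzero if the schema supplied a "default" */
    long default_int;   /* only int defaults are modelled in this sketch */
};

/*
 * Write the field as JSON into buf, emitting "default" only when one was
 * recorded. Returns the snprintf result (characters that would be written).
 */
int sketch_field_to_json(const struct sketch_field *f, char *buf, size_t len)
{
    assert(f != NULL && buf != NULL);
    if (f->has_default) {
        return snprintf(buf, len,
                        "{\"name\":\"%s\",\"type\":\"%s\",\"default\":%ld}",
                        f->name, f->type, f->default_int);
    }
    return snprintf(buf, len, "{\"name\":\"%s\",\"type\":\"%s\"}",
                    f->name, f->type);
}
```

With a field set up as { "hat", "int", 1, 12 } this writes
{"name":"hat","type":"int","default":12}, while a field with has_default
set to 0 is written without the default property, as happens today.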
Thanks in advance,
Chris.