Hi Chris

The C API doesn't support default values or schema evolution.  It does
support schema projection, where the reader schema has fewer fields than
the writer schema.

For this reason we switched to using the C++ API for our particular
project. C++ Schema evolution has only recently been baselined in
version 1.7.7 of Avro.

I did some work on trying to implement it in the C version, but gave up,
as I found the C code quite difficult to work with.

I agree the documentation should be clarified.  Schema projection and
type promotion are a subset of schema resolution - but schema evolution
is definitely missing!
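
To make the distinction concrete, here is a minimal, self-contained
sketch in plain C. It does not use the Avro C API; field_desc,
resolve_result, find_field and classify are hypothetical names invented
for illustration. It classifies a reader schema against a writer schema
by field name only: projection needs nothing extra, while a reader field
that the writer never wrote can only be resolved if it has a default.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified field descriptor; NOT the Avro C API. */
typedef struct {
    const char *name;
    int has_default;   /* non-zero if the schema declares a default */
} field_desc;

typedef enum {
    RESOLVE_OK,             /* every reader field exists in the writer schema */
    RESOLVE_NEEDS_DEFAULT,  /* some reader field must be filled from its default */
    RESOLVE_ERROR           /* some reader field is missing and has no default */
} resolve_result;

/* Returns the index of `name` in `fields`, or -1 if it is absent. */
int find_field(const field_desc *fields, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(fields[i].name, name) == 0) {
            return (int)i;
        }
    }
    return -1;
}

/* Classifies what a reader needs in order to read data written with
 * the writer schema. Writer fields that the reader lacks are simply
 * skipped, which is the projection case. */
resolve_result classify(const field_desc *writer, size_t nw,
                        const field_desc *reader, size_t nr)
{
    assert(writer != NULL && reader != NULL);
    resolve_result result = RESOLVE_OK;
    for (size_t i = 0; i < nr; i++) {
        if (find_field(writer, nw, reader[i].name) >= 0) {
            continue;               /* present in the written data */
        }
        if (!reader[i].has_default) {
            return RESOLVE_ERROR;   /* nothing to fill the field with */
        }
        result = RESOLVE_NEEDS_DEFAULT;
    }
    return result;
}
```

With a writer schema of foo, hat and bar, a reader asking for only foo
and bar gives RESOLVE_OK (projection), while a reader that adds a new
field gives RESOLVE_NEEDS_DEFAULT or RESOLVE_ERROR depending on whether
that field declares a default. Type promotion is left out of the sketch.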


Steve Roehrs

Senior Software Engineer | Lockheed Martin

 



-----Original Message-----
From: capo hatsoft [mailto:[email protected]] 
Sent: Wednesday, July 30, 2014 12:05 PM
To: [email protected]
Subject: Schema default values in C implementation

I recently extended a tool used at my work by adding an Avro output
module. The module works fine except that it appears to ignore default
values in the schema.

My program does something like the following, presuming schemaBuffer
contains:

{"type":"record",
  "name":"test",
  "fields":[
    {"name":"foo", "type":"int"},
    {"name":"hat", "type":"int", "default":12},
    {"name":"bar", "type":"string"}
  ]
}

avro_schema_t schema;
avro_schema_from_json_length(schemaBuffer, schemaLen, &schema);

avro_value_iface_t *iface;
avro_value_t writer_value, field;
avro_file_writer_t avro_writer;

iface = avro_generic_class_from_schema(schema);
avro_generic_value_new(iface, &writer_value);
avro_file_writer_create_with_codec(outFilePath, schema, &avro_writer,
                                   "deflate", blockSize);

// for each record in some data structure containing src data {
//   for each int value {
       // fieldName is a placeholder for the schema field being set
       avro_value_get_by_name(&writer_value, fieldName, &field, NULL);
       avro_value_set_int(&field, someIntValue);
//   }
//   similar code for other types...
     avro_file_writer_append_value(avro_writer, &writer_value);
// }

// clean up
// flush etc. on program exit

The result is that the program correctly creates an Avro-encoded file
with one record for each of my input records, with all the correct
values etc.

Except! The schema at the top of the output file created is different to
the input schema. It now looks like:

{"type":"record",
  "name":"test",
  "fields":[
    {"name":"foo", "type":"int"},
    {"name":"hat", "type":"int"},
    {"name":"bar", "type":"string"}
  ]
}

The default property just seems to be completely ignored by the schema
parser or otherwise not reproduced by the schema writer.

Having a look at the source code I came across this concerning struct in
schema.h:

struct avro_record_field_t {
        int index;
        char *name;
        avro_schema_t type;
        /*
         * TODO: default values
         */
};

So it appears that default values are not supported by Avro C?
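
As a rough illustration of what carrying a default alongside a field
could look like, here is a minimal, self-contained sketch in plain C.
It is not the Avro C struct or API: int_field_desc, int_slot and
apply_defaults are hypothetical names, and only int defaults are
modelled. Note also that the Avro specification defines defaults for
use when reading data that lacks a field, so filling them in before
writing, as done here, is an application-level convention rather than
something the library is expected to do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical field descriptor with an optional int default;
 * NOT struct avro_record_field_t from schema.h. */
typedef struct {
    const char *name;
    int has_default;   /* non-zero if default_int is meaningful */
    int default_int;
} int_field_desc;

/* Hypothetical per-record slot that tracks whether a value was set. */
typedef struct {
    int is_set;
    int value;
} int_slot;

/* Fills every unset slot from its field's default.
 * Returns 0 on success, or -1 as soon as it meets a slot that is
 * unset and whose field has no default. */
int apply_defaults(const int_field_desc *fields, int_slot *slots, size_t n)
{
    assert(fields != NULL && slots != NULL);
    for (size_t i = 0; i < n; i++) {
        if (slots[i].is_set) {
            continue;                    /* caller supplied a value */
        }
        if (!fields[i].has_default) {
            return -1;                   /* required field is missing */
        }
        slots[i].value = fields[i].default_int;
        slots[i].is_set = 1;
    }
    return 0;
}
```

For the schema above, a record that sets foo but not hat would end up
with hat equal to 12, and a record that leaves foo unset would be
rejected because foo has no default.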

I'm pretty confused however as the documentation at
http://avro.apache.org/docs/1.7.7/api/c/index.html states:

The C implementation supports:

   - binary encoding/decoding of all primitive and complex data types
   - storage to an Avro Object Container File
   - schema resolution, promotion and projection
   - validating and non-validating mode for writing Avro data

The C implementation is lacking:

   - RPC

Is the documentation wrong or am I just missing something?

I couldn't find any evidence that default values are supported after
reading over the source. Is this feature still planned to be
implemented? Should the documentation be updated to reflect that the C
implementation does not support default values?

This is a blocker for me, so I was considering extending Avro C to
support default values myself, but I thought I should check with the
mailing list first.

Thanks in advance,
Chris.
