Thanks for the informative reply. I look forward to the example code; that is exactly what I'm after.
I'm really struggling with my schema evolution testing. I posted a question about schema projection because it seemed simpler, but I guess it also rests on creating a resolver. I have not found a clear and simple example of how to do that using avro-c, even after trawling the test code. I realise that the majority of Avro usage appears to be in Java; however, I need to use avro-c for my assessment of Avro because a large portion of our system uses C.

Thanks for your help.

Chris

On Fri, Mar 1, 2013 at 7:31 AM, Douglas Creager <[email protected]> wrote:

> > There doesn't seem to be much information available on how to perform
> > these tasks. The examples on the C API page confusingly mix the old
> > datum API with the new value API.
>
> Apologies for that -- you're absolutely right that we need to clean up
> the C API documentation a bit.
>
> > Is this how schema projection is supposed to work? Does it just return
> > items of the same type irrespective of the field name specified?
>
> tl;dr -- The schema projection doesn't happen for free; you need to use a
> "resolved writer" to perform the schema resolution.
>
> In the C API, when you open an Avro file for reading, we expect that the
> avro_value_t that you pass in to avro_file_reader_read_value has the
> *exact same* schema that was used to create the file. So in your first
> example (gist 5056626), your read_archive_test function works great
> since it's explicitly asking the file for the writer schema, and using
> that to create the value instance to read into. If you know that you
> want to read exactly what's in the file, not perform any schema
> resolution, and (optionally) dynamically interrogate the writer schema
> to see what fields are available, this is exactly the right approach.
>
> On the other hand, if you want to use schema resolution to project away
> some of the fields (or to do other interesting data conversions), you
> need to create a resolved writer to perform that schema resolution. The
> resolved writer is an avro_value_iface_t that wraps up the schema
> resolution rules for a particular writer schema and reader schema. When
> you create an avro_value_t instance of the resolved writer, it looks
> like it's an instance of the writer schema, and it wraps an instance of
> the reader schema. Since the resolved writer value is an instance of
> the writer schema, you can read data into it using
> avro_file_reader_read_value. Under the covers, it will perform the
> schema resolution and fill in the wrapped reader schema instance. You
> can then read the projected data out of your reader value.
>
> In English that's probably still a bit too dense of an explanation; I'll
> whip together an example program and post it as a gist so that you can
> see it in actual code.
>
> (As an aside, the reason original projection_test worked the way that it
> did is because a single "record { int, int }" value happens to have the
> same serialization as two consecutive "int" values.
> avro_file_reader_read_value doesn't do any schema resolution, it just
> tries to read a value of the type that you pass in.)
>
> cheers
> –doug
