Hello!

I have a test.json file that looks like this:

{"first":"John", "last":"Doe", "middle":"C"}
{"first":"John", "last":"Doe"}

(The second line does NOT have a "middle" element.)

And I have a test.schema file that looks like this:

{"name":"test",
 "type":"record",
 "fields": [
    {"name":"first",  "type":"string"},
    {"name":"middle", "type":"string", "default":""},
    {"name":"last",   "type":"string"}
]}

I then try to use fromjson, as follows, and it chokes on the second line:

$ java -jar avro-tools-1.7.4.jar fromjson --schema-file test.schema test.json > test.avro
Exception in thread "main" org.apache.avro.AvroTypeException: Expected field name not found: middle
        at org.apache.avro.io.JsonDecoder.doAction(JsonDecoder.java:477)
        at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
        at org.apache.avro.io.JsonDecoder.advance(JsonDecoder.java:139)
        at org.apache.avro.io.JsonDecoder.readString(JsonDecoder.java:219)
        at org.apache.avro.io.JsonDecoder.readString(JsonDecoder.java:214)
        at org.apache.avro.io.ValidatingDecoder.readString(ValidatingDecoder.java:107)
        at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:348)
        at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:341)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:154)
        at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:177)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:148)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:139)
        at org.apache.avro.tool.DataFileWriteTool.run(DataFileWriteTool.java:105)
        at org.apache.avro.tool.Main.run(Main.java:80)
        at org.apache.avro.tool.Main.main(Main.java:69)


In short, I need to convert a batch of JSON records in which an element is sometimes absent. When it is absent, I'd like it to default to something sensible, e.g. an empty string or null.
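
The error above suggests that fromjson expects every schema field to be present in the JSON input. One possible workaround, sketched here in Python, is to fill in the missing fields from the schema's "default" values before running fromjson. This is only a rough sketch under that assumption, not something Avro provides: the helper names fill_defaults and convert_lines are made up for illustration, and it only handles top-level fields with plain defaults like the one in test.schema.

```python
import json


def fill_defaults(record, schema):
    """Return a copy of record with missing top-level fields set to the
    "default" value declared in the schema (hypothetical helper)."""
    filled = dict(record)
    for field in schema["fields"]:
        name = field["name"]
        if name not in filled:
            if "default" not in field:
                # No default declared, so there is nothing sensible to fill in.
                raise KeyError("field %r is missing and has no default" % name)
            filled[name] = field["default"]
    return filled


def convert_lines(lines, schema):
    """Yield one JSON string per non-blank input line, defaults filled in
    (hypothetical helper)."""
    for line in lines:
        if line.strip():
            yield json.dumps(fill_defaults(json.loads(line), schema))
```

The idea would be to load test.schema with json.load, pass the lines of test.json through convert_lines, write the results to a new file, and give that file to fromjson instead of the original.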

According to the Schema Resolution section of the specification, "if the reader's record schema has a field that contains a default value, and writer's schema does not have a field with the same name, then the reader should use the default value from its field."

I'm clearly missing something obvious, any help would be appreciated!

Grisha
