Short answer. Use this constructor instead:
/** Construct given writer's and reader's schema. */
public GenericDatumReader(Schema writer, Schema reader)
Longer answer:
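You have to give the GenericDatumReader the EXACT schema that was used to write
the bytes you are trying to parse (the "writer's schema"). Avro's binary
encoding does not store field names or types, so the reader cannot work out the
layout on its own.
You can *also* give it a second schema that you'd like the result to conform to
(the "reader's schema"), and that one can be different.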
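To make this concrete, the 12 and -25 you saw are not random. With only schema1,
the reader takes the first bytes of the stream to be schemaId and lmd, but those
bytes are really the start of the "dob" string: its length prefix (12) and its
first character ('1'). The sketch below reproduces that using only the JDK. It
is not Avro code; ZigZagDemo and its methods are names I made up, and it
re-implements the zig-zag varint encoding that the Avro specification describes
for int and long values.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class ZigZagDemo {

    /** Append a long the way Avro's binary encoding does: zig-zag, then base-128 varint. */
    static void writeLong(long n, ByteArrayOutputStream out) {
        long z = (n << 1) ^ (n >> 63);            // zig-zag: small magnitudes get small codes
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80)); // low 7 bits, continuation bit set
            z >>>= 7;
        }
        out.write((int) z);                       // final byte, continuation bit clear
    }

    /** Append a string: UTF-8 byte length as a long, followed by the UTF-8 bytes. */
    static void writeString(String s, ByteArrayOutputStream out) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(utf8.length, out);
        out.write(utf8, 0, utf8.length);
    }

    /** Read one zig-zag varint starting at pos[0] and advance pos[0] past it. */
    static long readLong(byte[] buf, int[] pos) {
        long z = 0;
        int shift = 0;
        int b;
        do {
            b = buf[pos[0]++] & 0xFF;
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1);              // undo the zig-zag mapping
    }

    public static void main(String[] args) {
        // What the writer emits for "dob", the first field of schema2.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString("161913600000", out);
        byte[] bytes = out.toByteArray();

        // What a reader that assumes schema1 does with those same bytes.
        int[] pos = {0};
        long asSchemaId = readLong(bytes, pos);   // really the string's length prefix
        long asLmd = readLong(bytes, pos);        // really the character '1' (0x31)
        System.out.println(asSchemaId);
        System.out.println(asLmd);
    }
}
```

Running main prints 12 and then -25, which matches the output in your message.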
Try changing this line of your code:
GenericDatumReader<GenericRecord> r1 = new
GenericDatumReader<GenericRecord>(schema1);
To this:
GenericDatumReader<GenericRecord> r1 = new
GenericDatumReader<GenericRecord>(schema, schema1);
// writer's schema is the one parsed from schema2.avsc (the variable named
// "schema" in your code), reader's schema is "schema1"
________________________________
From: Raihan Jamal <[email protected]>
Sent: Wednesday, September 25, 2013 5:10 PM
To: [email protected]
Subject: Deserialize the attributes data using another schema give me wrong
results
I am trying to serialize one of our attributes' data using an Apache Avro schema.
Here the attribute name is `e7` and the schema that I am using to serialize it
is `schema2.avsc` which is below.
{
"namespace": "com.avro.test.AvroExperiment",
"type": "record",
"name": "DEMOGRAPHIC",
"doc": "DEMOGRAPHIC data",
"fields": [
{"name": "dob", "type": "string"},
{"name": "gndr", "type": "string"},
{"name": "occupation", "type": "string"},
{"name": "mrtlStatus", "type": "string"},
{"name": "numChldrn", "type": "int"},
{"name": "estInc", "type": "string"},
{"name": "schemaId", "type": "int"},
{"name": "lmd", "type": "long"}
]
}
Below is the code that I am using to serialize the attribute `e7` with the
above Avro schema `schema2.avsc`. I am able to serialize it properly and it
works fine.
Schema schema = new
Parser().parse((AvroExperiment.class.getResourceAsStream("/schema2.avsc")));
GenericRecord record = new GenericData.Record(schema);
record.put("dob", "161913600000");
record.put("gndr", "f");
record.put("occupation", "doctor");
record.put("mrtlStatus", "single");
record.put("numChldrn", 3);
record.put("estInc", "50000");
record.put("schemaId", 20001);
record.put("lmd", 1379814280254L);
GenericDatumWriter<GenericRecord> writer = new
GenericDatumWriter<GenericRecord>(schema);
ByteArrayOutputStream os = new ByteArrayOutputStream();
Encoder e = EncoderFactory.get().binaryEncoder(os, null);
writer.write(record, e);
e.flush();
byte[] byteData = os.toByteArray();
os.close();
Next, I tried deserializing the same `e7` attribute data using the same Avro
schema definition, `schema2.avsc`. That also works fine, and I am able to
deserialize it properly.
GenericDatumReader<GenericRecord> r = new
GenericDatumReader<GenericRecord>(schema);
BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(byteData, null);
GenericRecord result = r.read(null, decoder);
System.out.println(result);
System.out.println(result.get("schemaId"));
System.out.println(result.get("lmd"));
Then I thought, let's deserialize the same attribute data using another Avro
schema that I have, `schema1.avsc`, and extract only `schemaId` and `lmd` from
it. Below is the schema:
{
"namespace": "com.avro.test.AvroExperiment",
"type": "record",
"name": "DEMOGRAPHIC",
"doc": "DEMOGRAPHIC data",
"fields": [
{"name": "schemaId", "type": "int"},
{"name": "lmd", "type": "long"}
]
}
/**
* Deserialize the same byte data using another Avro Schema
*/
Schema schema1 = new
Parser().parse((AvroExperiment.class.getResourceAsStream("/schema1.avsc")));
GenericDatumReader<GenericRecord> r1 = new
GenericDatumReader<GenericRecord>(schema1);
BinaryDecoder decoder1 = DecoderFactory.get().binaryDecoder(byteData, null);
GenericRecord result1 = r1.read(null, decoder1);
System.out.println(result1);
System.out.println(result1.get("schemaId"));
System.out.println(result1.get("lmd"));
But somehow the above code prints the following, which is wrong. I am not sure
what I did wrong.
{"schemaId": 12, "lmd": -25}
12
-25
It should be printing out like this:
{"schemaId": 20001, "lmd": 1379814280254}
20001
1379814280254
Can anyone help me figure out what I did wrong?