Hi,

I could not reach the users mailing list, so I am trying the dev mailing list instead. I can't find a contact email for Jira account creation, and I can't join Slack because I have no ASF email account.
The issue is described below.

Thanks,
Matthew

---------- Forwarded message ---------
From: Matthew Chng <[email protected]>
Date: Wed, Dec 7, 2022 at 4:18 PM
Subject: Possible bug with avro-js with binary serdes and schema resolution
To: <[email protected]>

Hi all,

I am encountering an issue with the avro-js NPM module: data serialized to a binary buffer with one schema cannot be read with a different but compatible (evolved) reader schema. The issue occurs only with the `toBuffer()` and `fromBuffer()` methods; the `toString()` and `fromString()` JSON serdes methods work as expected.

The following is an example of an evolving schema. The only difference is the additional `gender` field, which has a default value.

const parentV1Type = avro.parse({
  name: 'Parent',
  type: 'record',
  fields: [
    { name: 'name', type: 'string' }
  ]
})

const parentV2Type = avro.parse({
  name: 'Parent',
  type: 'record',
  fields: [
    { name: 'name', type: 'string' },
    { name: 'gender', type: 'string', default: 'unspecified' }
  ]
})

According to https://avro.apache.org/docs/1.11.1/specification/#schema-resolution the two schemas should be both backward and forward compatible for readers, because the following rules apply:

- both schemas are records with the same (unqualified) name
- if the writer’s record contains a field with a name not present in the reader’s record, the writer’s value for that field is ignored.
- if the reader’s record schema has a field that contains a default value, and writer’s schema does not have a field with the same name, then the reader should use the default value from its field.

Testing with the `toString()` and `fromString()` JSON serdes methods confirmed this.

I've created a simple test script to reproduce the issue; it also includes a test with a nested schema. The script is included after the output.
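To make the quoted resolution rules concrete, here is a minimal sketch in plain JavaScript. `resolveRecord` is a hypothetical helper written for illustration, not part of avro-js, and it works on already-decoded values rather than on binary data. It assumes the reader knows the writer's field list.

```javascript
// Hypothetical helper (not an avro-js API): applies the record resolution
// rules from the Avro specification to an already-decoded writer value.
function resolveRecord(writerFields, readerFields, writerValue) {
  const writerNames = new Set(writerFields.map((f) => f.name))
  const result = {}
  for (const field of readerFields) {
    if (writerNames.has(field.name)) {
      // Field known to both sides: take the writer's value.
      result[field.name] = writerValue[field.name]
    } else if ('default' in field) {
      // Field only in the reader: fall back to the reader's default.
      result[field.name] = field.default
    } else {
      throw new Error(`no default for missing field: ${field.name}`)
    }
  }
  // Writer fields that the reader does not declare are simply ignored.
  return result
}

const v1Fields = [{ name: 'name', type: 'string' }]
const v2Fields = [
  { name: 'name', type: 'string' },
  { name: 'gender', type: 'string', default: 'unspecified' }
]

// Writer V1, reader V2: prints {"name":"David","gender":"unspecified"}
console.log(JSON.stringify(resolveRecord(v1Fields, v2Fields, { name: 'David' })))
// Writer V2, reader V1: prints {"name":"David"}
console.log(JSON.stringify(resolveRecord(v2Fields, v1Fields, { name: 'David', gender: 'Father' })))
```

Note that both calls need the writer's field list as well as the reader's, which is the information the rules are phrased in terms of.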
The errors encountered are either
- truncated buffer; or
- trailing data

Script output:

--- JSON Writer: ParentV1, Reader: ParentV2 ---
parentV1Json: {"name":"David"}
parentV2ReadFromV1Json: {"name":"David","gender":"unspecified"}

--- JSON Writer: ParentV2, Reader: ParentV1 ---
parentV2Json: {"name":"David","gender":"Father"}
parentV1ReadFromV2Json: {"name":"David"}

--- Buffer Writer: ParentV1, Reader: ParentV1 ---
parentV1Buffer: <Buffer 0a 44 61 76 69 64>
parentV1ReadFromV1Buffer: {"name":"David"}

--- Buffer Writer: ParentV2, Reader: ParentV2 ---
parentV2Buffer: <Buffer 0a 44 61 76 69 64 0c 46 61 74 68 65 72>
parentV2ReadFromV1Buffer: {"name":"David","gender":"Father"}

--- Buffer Writer: ParentV1, Reader: ParentV2 ---
parentV1Buffer: <Buffer 0a 44 61 76 69 64>
parentV2ReadFromV1Buffer: ERROR truncated buffer

--- Buffer Writer: ParentV2, Reader: ParentV1 ---
parentV2Buffer: <Buffer 0a 44 61 76 69 64 0c 46 61 74 68 65 72>
parentV1ReadFromV2Buffer: ERROR trailing data

--- JSON Writer: meWithParentV1, Reader: meWithParentV2 ---
meWithParentV1Json: {"name":"Davidson","parent":{"name":"David"}}
meWithParentV2ReadFromV1Json: {"name":"Davidson","parent":{"name":"David","gender":"unspecified"}}

--- JSON Writer: meWithParentV2, Reader: meWithParentV1 ---
meWithParentV2Json: {"name":"Davidson","parent":{"name":"David","gender":"Father"}}
meWithParentV1ReadFromV2Json: {"name":"Davidson","parent":{"name":"David"}}

--- Buffer Writer: meWithParentV1, Reader: meWithParentV1 ---
meWithParentV1Buffer: <Buffer 10 44 61 76 69 64 73 6f 6e 0a 44 61 76 69 64>
meWithParentV1ReadFromV1Buffer: {"name":"Davidson","parent":{"name":"David"}}

--- Buffer Writer: meWithParentV2, Reader: meWithParentV2 ---
meWithParentV2Buffer: <Buffer 10 44 61 76 69 64 73 6f 6e 0a 44 61 76 69 64 0c 46 61 74 68 65 72>
meWithParentV2ReadFromV2Buffer: {"name":"Davidson","parent":{"name":"David","gender":"Father"}}

--- Buffer Writer: meWithParentV1, Reader: meWithParentV2 ---
meWithParentV1Buffer: <Buffer 10 44 61 76 69 64 73 6f 6e 0a 44 61 76 69 64>
meWithParentV2ReadFromV1Buffer: ERROR truncated buffer

--- Buffer Writer: meWithParentV2, Reader: meWithParentV1 ---
meWithParentV2Buffer: <Buffer 10 44 61 76 69 64 73 6f 6e 0a 44 61 76 69 64 0c 46 61 74 68 65 72>
meWithParentV1ReadFromV2Buffer: ERROR trailing data

Script src:

const avro = require('avro-js')

// Two versions of the Parent record; V2 adds `gender` with a default.
const parentV1Type = avro.parse({
  name: 'Parent',
  type: 'record',
  fields: [
    { name: 'name', type: 'string' }
  ]
})

const parentV2Type = avro.parse({
  name: 'Parent',
  type: 'record',
  fields: [
    { name: 'name', type: 'string' },
    { name: 'gender', type: 'string', default: 'unspecified' }
  ]
})

// Nested case: a Me record that embeds either version of Parent.
const meSchema = {
  name: 'Me',
  type: 'record',
  fields: [
    { name: 'name', type: 'string' },
    { name: 'parent', type: 'Parent' }
  ]
}

const meWithParentV2Type = avro.parse(meSchema, { registry: { Parent: parentV2Type } })
const meWithParentV1Type = avro.parse(meSchema, { registry: { Parent: parentV1Type } })

// Sample values.
const parentV1 = { name: 'David' }
const parentV2 = { name: 'David', gender: 'Father' }
const meWithParentV1 = { name: 'Davidson', parent: parentV1 }
const meWithParentV2 = { name: 'Davidson', parent: parentV2 }

// JSON serdes, Parent record.
console.log("")
console.log("--- JSON Writer: ParentV1, Reader: ParentV2 ---")
const parentV1Json = parentV1Type.toString(parentV1)
console.log("parentV1Json: ", parentV1Json)
const parentV2ReadFromV1Json = parentV2Type.fromString(parentV1Json)
console.log("parentV2ReadFromV1Json: ", JSON.stringify(parentV2ReadFromV1Json))

console.log("")
console.log("--- JSON Writer: ParentV2, Reader: ParentV1 ---")
const parentV2Json = parentV2Type.toString(parentV2)
console.log("parentV2Json: ", parentV2Json)
const parentV1ReadFromV2Json = parentV1Type.fromString(parentV2Json)
console.log("parentV1ReadFromV2Json: ", JSON.stringify(parentV1ReadFromV2Json))

// Binary serdes, Parent record, same writer and reader schema.
console.log("")
console.log("--- Buffer Writer: ParentV1, Reader: ParentV1 ---")
const parentV1Buffer = parentV1Type.toBuffer(parentV1)
console.log("parentV1Buffer: ", parentV1Buffer)
const parentV1ReadFromV1Buffer = parentV1Type.fromBuffer(parentV1Buffer)
console.log("parentV1ReadFromV1Buffer: ", JSON.stringify(parentV1ReadFromV1Buffer))

console.log("")
console.log("--- Buffer Writer: ParentV2, Reader: ParentV2 ---")
const parentV2Buffer = parentV2Type.toBuffer(parentV2)
console.log("parentV2Buffer: ", parentV2Buffer)
const parentV2ReadFromV2Buffer = parentV2Type.fromBuffer(parentV2Buffer)
console.log("parentV2ReadFromV1Buffer: ", JSON.stringify(parentV2ReadFromV2Buffer))

// Binary serdes, Parent record, different writer and reader schema.
console.log("")
console.log("--- Buffer Writer: ParentV1, Reader: ParentV2 ---")
console.log("parentV1Buffer: ", parentV1Buffer)
try {
  const parentV2ReadFromV1Buffer = parentV2Type.fromBuffer(parentV1Buffer)
  console.log("parentV2ReadFromV1Buffer: ", JSON.stringify(parentV2ReadFromV1Buffer))
} catch (e) {
  console.log("parentV2ReadFromV1Buffer: ERROR ", e.message)
}

console.log("")
console.log("--- Buffer Writer: ParentV2, Reader: ParentV1 ---")
console.log("parentV2Buffer: ", parentV2Buffer)
try {
  const parentV1ReadFromV2Buffer = parentV1Type.fromBuffer(parentV2Buffer)
  console.log("parentV1ReadFromV2Buffer: ", JSON.stringify(parentV1ReadFromV2Buffer))
} catch (e) {
  console.log("parentV1ReadFromV2Buffer: ERROR ", e.message)
}

// JSON serdes, nested Me record.
console.log("")
console.log("--- JSON Writer: meWithParentV1, Reader: meWithParentV2 ---")
const meWithParentV1Json = meWithParentV1Type.toString(meWithParentV1)
console.log("meWithParentV1Json: ", meWithParentV1Json)
const meWithParentV2ReadFromV1Json = meWithParentV2Type.fromString(meWithParentV1Json)
console.log("meWithParentV2ReadFromV1Json: ", JSON.stringify(meWithParentV2ReadFromV1Json))

console.log("")
console.log("--- JSON Writer: meWithParentV2, Reader: meWithParentV1 ---")
const meWithParentV2Json = meWithParentV2Type.toString(meWithParentV2)
console.log("meWithParentV2Json: ", meWithParentV2Json)
const meWithParentV1ReadFromV2Json = meWithParentV1Type.fromString(meWithParentV2Json)
console.log("meWithParentV1ReadFromV2Json: ", JSON.stringify(meWithParentV1ReadFromV2Json))

// Binary serdes, nested Me record, same writer and reader schema.
console.log("")
console.log("--- Buffer Writer: meWithParentV1, Reader: meWithParentV1 ---")
const meWithParentV1Buffer = meWithParentV1Type.toBuffer(meWithParentV1)
console.log("meWithParentV1Buffer: ", meWithParentV1Buffer)
const meWithParentV1ReadFromV1Buffer = meWithParentV1Type.fromBuffer(meWithParentV1Buffer)
console.log("meWithParentV1ReadFromV1Buffer: ", JSON.stringify(meWithParentV1ReadFromV1Buffer))

console.log("")
console.log("--- Buffer Writer: meWithParentV2, Reader: meWithParentV2 ---")
const meWithParentV2Buffer = meWithParentV2Type.toBuffer(meWithParentV2)
console.log("meWithParentV2Buffer: ", meWithParentV2Buffer)
const meWithParentV2ReadFromV2Buffer = meWithParentV2Type.fromBuffer(meWithParentV2Buffer)
console.log("meWithParentV2ReadFromV2Buffer: ", JSON.stringify(meWithParentV2ReadFromV2Buffer))

// Binary serdes, nested Me record, different writer and reader schema.
console.log("")
console.log("--- Buffer Writer: meWithParentV1, Reader: meWithParentV2 ---")
console.log("meWithParentV1Buffer: ", meWithParentV1Buffer)
try {
  const meWithParentV2ReadFromV1Buffer = meWithParentV2Type.fromBuffer(meWithParentV1Buffer)
  console.log("meWithParentV2ReadFromV1Buffer: ", JSON.stringify(meWithParentV2ReadFromV1Buffer))
} catch (e) {
  console.log("meWithParentV2ReadFromV1Buffer: ERROR ", e.message)
}

console.log("")
console.log("--- Buffer Writer: meWithParentV2, Reader: meWithParentV1 ---")
console.log("meWithParentV2Buffer: ", meWithParentV2Buffer)
try {
  const meWithParentV1ReadFromV2Buffer = meWithParentV1Type.fromBuffer(meWithParentV2Buffer)
  console.log("meWithParentV1ReadFromV2Buffer: ", JSON.stringify(meWithParentV1ReadFromV2Buffer))
} catch (e) {
  console.log("meWithParentV1ReadFromV2Buffer: ERROR ", e.message)
}

If this is indeed a bug, how do I create a ticket in the Jira board to report it?

Thanks.
Matthew
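For reference, the two error messages can be reproduced without avro-js by decoding the printed bytes by hand. The sketch below is an assumption about the cause, not a reading of the avro-js source: per the Avro specification, a string is encoded as a zigzag varint length followed by UTF-8 bytes, and a record is just its fields in schema order with no field names. `readString` and `readStrings` are hypothetical helpers, and their error messages were chosen to mirror the ones in the output above.

```javascript
// Hypothetical helper (not an avro-js API): reads one Avro string, i.e. a
// zigzag varint length followed by that many UTF-8 bytes, starting at `pos`.
function readString(buf, pos) {
  let n = 0
  let shift = 0
  let b
  do {
    if (pos >= buf.length) throw new Error('truncated buffer')
    b = buf[pos++]
    n |= (b & 0x7f) << shift
    shift += 7
  } while (b & 0x80)
  const len = (n >>> 1) ^ -(n & 1) // undo zigzag encoding
  if (len < 0 || pos + len > buf.length) throw new Error('truncated buffer')
  return { value: buf.toString('utf8', pos, pos + len), pos: pos + len }
}

// Decodes a record whose fields are all strings, in the given schema order.
// The bytes carry no field names, so the order is all the decoder has.
function readStrings(buf, fieldNames) {
  const record = {}
  let pos = 0
  for (const name of fieldNames) {
    const r = readString(buf, pos)
    record[name] = r.value
    pos = r.pos
  }
  if (pos < buf.length) throw new Error('trailing data')
  return record
}

// Bytes copied from the script output above.
const v1Bytes = Buffer.from('0a4461766964', 'hex')
const v2Bytes = Buffer.from('0a44617669640c466174686572', 'hex')

function tryRead(buf, fieldNames) {
  try {
    return JSON.stringify(readStrings(buf, fieldNames))
  } catch (e) {
    return 'ERROR ' + e.message
  }
}

console.log(tryRead(v1Bytes, ['name']))           // {"name":"David"}
console.log(tryRead(v1Bytes, ['name', 'gender'])) // ERROR truncated buffer
console.log(tryRead(v2Bytes, ['name']))           // ERROR trailing data
```

If this is what happens, a binary reader needs the writer's schema in addition to its own. avro-js appears to expose this as `readerType.createResolver(writerType)`, with the resulting resolver passed as the second argument to `fromBuffer()`. That is an assumption and was not exercised by the script above.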
