[jira] [Commented] (AVRO-1318) Python schema should store fingerprints
[ https://issues.apache.org/jira/browse/AVRO-1318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699249#comment-13699249 ]

Jeremy Kahn commented on AVRO-1318:
---

I prefer the approach you ([~cutting]) suggest -- especially the object aspects of it, and especially if the objects can be derived from {{collections.Sequence}} and {{collections.Mapping}} so that existing accessors can keep working the same way.

Unfortunately, I don't have any free cycles for this, though I'd be happy to contribute later in July. I don't know if this should block 1.7.5 release though.

Python schema should store fingerprints
---

Key: AVRO-1318
URL: https://issues.apache.org/jira/browse/AVRO-1318
Project: Avro
Issue Type: Bug
Components: python
Reporter: Jeremy Kahn
Assignee: Jeremy Kahn
Priority: Minor
Labels: features
Attachments: AVRO-1318.patch

Python schema objects need to produce a simple representation that demonstrates their field identity. {avro.schema.Schema} objects need to provide a {fingerprint} member field to enable quick checking of schema matching (even when the schema has other, possibly changed decoration).

Based on a patch pulled from [~laserson]'s proposed changes to make a collection of C-typing hints. These changes will be backwards-compatible.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (AVRO-1318) Python schema should store fingerprints
[ https://issues.apache.org/jira/browse/AVRO-1318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698498#comment-13698498 ]

Jeremy Kahn commented on AVRO-1318:
---

[~laserson], glad you're game to contribute!
- what cleanup do you think is needed?
- would it be possible to use your work _without_ requiring a change to the read/write API? (could the old API be preserved in terms of your new one?)
[jira] [Commented] (AVRO-1343) Python: validate too permissive on records with extra fields
[ https://issues.apache.org/jira/browse/AVRO-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697022#comment-13697022 ]

Jeremy Kahn commented on AVRO-1343:
---

It causes problems for unioned data in Python, because Python moves to generic data and then introspects the data with {{validate}} to determine which union member to use to re-encode the data.

Suppose I start with a schema:
{code}
{"type": "record", "name": "superset", "fields": [ {"name": "foo", "type": "int"}, {"name": "bar", "type": "string"} ] }
{code}
If I encode these two lines with a schema of _only_ {{superset}} objects:
{code}
{"foo": 99, "bar": "banana"}
{"foo": -98, "bar": "peaches"}
{code}
the data is entirely recoverable. But if I rewrite that datafile with a schema supporting a union of {{superset}} and {{subset}}
{code}
[{"type": "record", "name": "superset", "fields": [ {"name": "foo", "type": "int"}, {"name": "bar", "type": "string"} ] },
 {"type": "record", "name": "subset", "fields": [ {"name": "foo", "type": "int"} ] } ]
{code}
the data will be re-encoded as {{subset}} objects, silently discarding the {{bar}} field.

This behavior seems fundamentally backwards-breaking _as unpatched_, but here's a way we could rewrite it to only affect union member selection: I could rewrite the patch to pass an optional {{strict}} argument (default {{False}}) to {{validate}}, and then to use {{strict=True}} when doing union-member selection. This would, I believe, allow extra fields for simple records, but discard them when determining the correct member.

Of course, someone might still be expecting to put things into Python unions with extra fields and depending on the schema to discard these, but I think anyone with that expected behavior would have encountered this bug already.
Python: validate too permissive on records with extra fields
---

Key: AVRO-1343
URL: https://issues.apache.org/jira/browse/AVRO-1343
Project: Avro
Issue Type: Bug
Components: python
Reporter: Jeremy Kahn
Assignee: Jeremy Kahn
Fix For: 1.7.5
Attachments: AVRO-1343-tests.patch, AVRO-1343-validate.patch

Python's validator silently accepts (generic) records with extra fields and considers them valid. For example, {{io.validate}} silently considers that the schema:
{noformat}
{"type": "record", "name": "Test", "fields": [{"name": "f", "type": "long"}]}
{noformat}
should accept records like:
{noformat}
{'f': 5, 'extra_field': "abc"}
{noformat}
but this is problematic.

This is *especially* problematic for encoding unions, because internally the Python serializer uses {{validate}} to find the appropriate schema with which to encode a given object. In the current implementation, union schema selection is the *last* schema that {{validate(schema, obj)}} returns {{True}} for. If {{validate}} isn't picky, this encoding will frequently guess wrong.

I will attach two patches: one to the tests and one to the {{validate}} function.
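The {{strict}} idea proposed in the comment can be sketched in a few lines. This is only an illustration under stated assumptions: it works on plain dict schemas rather than {{avro.schema}} objects, handles only {{int}}, {{long}}, {{string}}, records and unions, and the names {{validate}}, {{select_union_member}}, {{SUPERSET}}, {{SUBSET}} and {{UNION}} are hypothetical stand-ins, not the code in the attached patches. It also picks the first strict match rather than the last match that the issue describes for the current implementation.

```python
def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


# Hypothetical, deliberately tiny table of primitive checks.
PRIMITIVES = {
    "int": _is_int,
    "long": _is_int,
    "string": lambda value: isinstance(value, str),
}


def validate(schema, datum, strict=False):
    """Return True if datum conforms to schema (plain-dict schemas only)."""
    if isinstance(schema, str):
        return PRIMITIVES[schema](datum)
    if isinstance(schema, list):
        # A union is valid if any member accepts the datum strictly.
        return any(validate(member, datum, strict=True) for member in schema)
    if schema["type"] == "record":
        if not isinstance(datum, dict):
            return False
        field_names = {field["name"] for field in schema["fields"]}
        if strict and set(datum) - field_names:
            return False  # extra fields are rejected only in strict mode
        return all(
            validate(field["type"], datum.get(field["name"]), strict)
            for field in schema["fields"]
        )
    # e.g. {"type": "int"}
    return validate(schema["type"], datum, strict)


def select_union_member(union, datum):
    """Pick the union member used for encoding, using strict validation."""
    for member in union:
        if validate(member, datum, strict=True):
            return member
    raise ValueError("datum matches no union member")


SUPERSET = {"type": "record", "name": "superset", "fields": [
    {"name": "foo", "type": "int"}, {"name": "bar", "type": "string"}]}
SUBSET = {"type": "record", "name": "subset", "fields": [
    {"name": "foo", "type": "int"}]}
UNION = [SUPERSET, SUBSET]
```

With the default {{strict=False}}, a lone {{subset}} schema still accepts a record carrying an extra {{bar}} field, which keeps the existing permissive behavior for simple records; only union member selection becomes picky.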
[jira] [Commented] (AVRO-1318) Python schema should store fingerprints
[ https://issues.apache.org/jira/browse/AVRO-1318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697032#comment-13697032 ]

Jeremy Kahn commented on AVRO-1318:
---

The purpose is roughly the same, if I understand correctly. This fingerprint notion is copied from [~laserson]'s [perf branch|https://github.com/laserson/avro/tree/perf] to avoid recomputation of evolution decisions (to cache encoder and decoder objects, quoting the spec).

This delta implements most of the Parsing Canonical Form part of the spec, if I understand correctly, but should be reviewed in light of that, for sure.

I've found Uri's work on this useful to support Cython extensions, but adapting the Python decoder and encoder to cache those encoders and decoders is a pretty big change. I thought this one bit should be safe enough to include without requiring a 1.8.0 bump, so I pushed it forward as a proposal.
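The caching use described in this comment might look roughly like the sketch below. It is an assumption-laden illustration, not the patch or the perf branch: schemas are plain dicts, {{strip_decoration}} only approximates the spec's Parsing Canonical Form (it keeps the attributes the spec lists but does no name resolution or attribute ordering beyond sorted keys), SHA-256 is just one plausible hash, and {{fingerprint}}, {{get_decoder}} and the {{build}} callback are hypothetical names.

```python
import hashlib
import json

# Attributes retained when computing the fingerprint; everything else
# (doc, aliases, defaults, custom properties) is treated as decoration.
KEPT_ATTRIBUTES = ("name", "type", "fields", "symbols", "items", "values", "size")


def strip_decoration(schema):
    """Recursively drop attributes that do not affect the encoded data."""
    if isinstance(schema, list):
        return [strip_decoration(member) for member in schema]
    if isinstance(schema, dict):
        return {
            key: strip_decoration(schema[key])
            for key in KEPT_ATTRIBUTES
            if key in schema
        }
    return schema  # primitive type name or enum symbol


def fingerprint(schema):
    """Hex digest of a compact, key-sorted rendering of the stripped schema."""
    canonical = json.dumps(
        strip_decoration(schema), sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


_decoder_cache = {}


def get_decoder(writer_schema, reader_schema, build):
    """Reuse a decoder for any schema pair with the same fingerprints.

    build is a hypothetical factory: build(writer_schema, reader_schema).
    """
    key = (fingerprint(writer_schema), fingerprint(reader_schema))
    if key not in _decoder_cache:
        _decoder_cache[key] = build(writer_schema, reader_schema)
    return _decoder_cache[key]
```

Two schemas that differ only in {{doc}} strings share a fingerprint here, so the second lookup skips the (potentially expensive) schema-resolution work.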
[jira] [Commented] (AVRO-1318) Python schema should store fingerprints
[ https://issues.apache.org/jira/browse/AVRO-1318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697076#comment-13697076 ]

Jeremy Kahn commented on AVRO-1318:
---

It is not an end-user use case. It's a useful performance win, if you're caching encoder and decoder objects, as Uri's changes do. I've written my own extensions based heavily on this signature behavior.

I'd be happy to have [~laserson]'s perf branch added. [~cutting], perhaps you can go chat with him about this? He's a Clouderian, if I understand correctly.
[jira] [Work started] (AVRO-1343) Python: validate too permissive on records with extra fields
[ https://issues.apache.org/jira/browse/AVRO-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on AVRO-1343 started by Jeremy Kahn.
[jira] [Created] (AVRO-1343) Python: validate too permissive on records with extra fields
Jeremy Kahn created AVRO-1343:
---

Summary: Python: validate too permissive on records with extra fields
Key: AVRO-1343
URL: https://issues.apache.org/jira/browse/AVRO-1343
Project: Avro
Issue Type: Bug
Components: python
Reporter: Jeremy Kahn
Assignee: Jeremy Kahn
Fix For: 1.7.5

Python's validator silently accepts (generic) records with extra fields and considers them valid. For example, {{io.validate}} silently considers that the schema:
{noformat}
{"type": "record", "name": "Test", "fields": [{"name": "f", "type": "long"}]}
{noformat}
should accept records like:
{noformat}
{'f': 5, 'extra_field': "abc"}
{noformat}
but this is problematic.

This is *especially* problematic for encoding unions, because internally the Python serializer uses {{validate}} to find the appropriate schema with which to encode a given object. In the current implementation, union schema selection is the *last* schema that {{validate(schema, obj)}} returns {{True}} for. If {{validate}} isn't picky, this encoding will frequently guess wrong.

I will attach two patches: one to the tests and one to the {{validate}} function.
[jira] [Updated] (AVRO-1343) Python: validate too permissive on records with extra fields
[ https://issues.apache.org/jira/browse/AVRO-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeremy Kahn updated AVRO-1343:
---

Attachment: AVRO-1343-validate.patch
            AVRO-1343-tests.patch
[jira] [Updated] (AVRO-1343) Python: validate too permissive on records with extra fields
[ https://issues.apache.org/jira/browse/AVRO-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeremy Kahn updated AVRO-1343:
---

Status: Patch Available (was: In Progress)

I hope these patches can be accepted into 1.7.5.
[jira] [Created] (AVRO-1331) Java reader backwards-compatibility breakage
Jeremy Kahn created AVRO-1331:
---

Summary: Java reader backwards-compatibility breakage
Key: AVRO-1331
URL: https://issues.apache.org/jira/browse/AVRO-1331
Project: Avro
Issue Type: Bug
Components: java
Affects Versions: 1.7.5
Reporter: Jeremy Kahn
Attachments: stripped-snipped.avro, stripped-snipped.schema

For some cases where we encode Avro data with Avro 1.7.4, it is not readable with Avro 1.7.5-SNAPSHOT post AVRO-1295: the Java decoder is unable to discover the root definitions.

Among the properties of (some) schemas that trigger this failure, the schema:
- has an explicit empty string as the root namespace, and
- uses other namespaces elsewhere in the schema, and
- has a recursive reference to the root

A sample schema and a sample datafile with one example encoded with that schema are attached. This datafile cannot be read with Java deserializers (and I believe that the schema cannot be parsed by the Java schema parser).
[jira] [Updated] (AVRO-1331) Java reader backwards-compatibility breakage
[ https://issues.apache.org/jira/browse/AVRO-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeremy Kahn updated AVRO-1331:
---

Attachment: stripped-snipped.schema
            stripped-snipped.avro

The sample stripped and snipped files trigger this misbehavior with versions of 1.7.5-SNAPSHOT that include the AVRO-1295 patch.
[jira] [Commented] (AVRO-1318) Python schema should store fingerprints
[ https://issues.apache.org/jira/browse/AVRO-1318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13652062#comment-13652062 ]

Jeremy Kahn commented on AVRO-1318:
---

Nudging this issue to ask for review from a Pythonista and/or a committer. It'd be great if AVRO-1318 and AVRO-1323 could be included in the Avro 1.7.5 release.
[jira] [Work started] (AVRO-1318) Python schema should store fingerprints
[ https://issues.apache.org/jira/browse/AVRO-1318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on AVRO-1318 started by Jeremy Kahn.
[jira] [Created] (AVRO-1318) Python schema should store fingerprints
Jeremy Kahn created AVRO-1318:
---

Summary: Python schema should store fingerprints
Key: AVRO-1318
URL: https://issues.apache.org/jira/browse/AVRO-1318
Project: Avro
Issue Type: Bug
Components: python
Reporter: Jeremy Kahn
Assignee: Jeremy Kahn
Priority: Minor

Python schema objects need to produce a simple representation that demonstrates their field identity. {avro.schema.Schema} objects need to provide a {fingerprint} member field to enable quick checking of schema matching (even when the schema has other, possibly changed decoration).

Based on a patch pulled from [~laserson]'s proposed changes to make a collection of C-typing hints. These changes will be backwards-compatible.
[jira] [Created] (AVRO-1323) Python request schemas should report fullname
Jeremy Kahn created AVRO-1323:
---

Summary: Python request schemas should report fullname
Key: AVRO-1323
URL: https://issues.apache.org/jira/browse/AVRO-1323
Project: Avro
Issue Type: Bug
Components: python
Reporter: Jeremy Kahn
Priority: Minor

Avro request objects in the Python library are treated as a special kind of record schema without a name. But such objects should have a name -- if nothing else, they should have the same name as the message that they belong to.

Blocks AVRO-1318, in which fingerprints require every schema type -- including requests -- to report a fingerprint (usually their name).

It's an easy fix; I'll attach a patch in a few minutes.
[jira] [Assigned] (AVRO-1323) Python request schemas should report fullname
[ https://issues.apache.org/jira/browse/AVRO-1323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeremy Kahn reassigned AVRO-1323:
---

Assignee: Jeremy Kahn
[jira] [Work started] (AVRO-1323) Python request schemas should report fullname
[ https://issues.apache.org/jira/browse/AVRO-1323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on AVRO-1323 started by Jeremy Kahn.
[jira] [Updated] (AVRO-1323) Python request schemas should report fullname
[ https://issues.apache.org/jira/browse/AVRO-1323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeremy Kahn updated AVRO-1323:
---

Attachment: AVRO-1323.patch
[jira] [Updated] (AVRO-1323) Python request schemas should report fullname
[ https://issues.apache.org/jira/browse/AVRO-1323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeremy Kahn updated AVRO-1323:
---

Status: Patch Available (was: In Progress)

Patch work done [here|https://github.com/jkahn/avro/tree/AVRO-1323].
[jira] [Updated] (AVRO-1318) Python schema should store fingerprints
[ https://issues.apache.org/jira/browse/AVRO-1318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeremy Kahn updated AVRO-1318:
---

Attachment: AVRO-1318.patch

Patch from [here|https://github.com/jkahn/avro/tree/AVRO-1318]. A copy of changes from [~laserson] for fingerprinting files.
[jira] [Updated] (AVRO-1318) Python schema should store fingerprints
[ https://issues.apache.org/jira/browse/AVRO-1318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeremy Kahn updated AVRO-1318:
---

Status: Patch Available (was: In Progress)

Tests pass {{ant clean build}} for me.
[jira] [Commented] (AVRO-1318) Python schema should store fingerprints
[ https://issues.apache.org/jira/browse/AVRO-1318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650219#comment-13650219 ]

Jeremy Kahn commented on AVRO-1318:
---

Oh, to be clear: when AVRO-1323 is included, then AVRO-1318 passes. The AVRO-1318 changes trigger the bug that AVRO-1323 addresses.
[jira] [Commented] (AVRO-1316) IDL code-generation generates too-long literals for very large schemas
[ https://issues.apache.org/jira/browse/AVRO-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13648060#comment-13648060 ]

Jeremy Kahn commented on AVRO-1316:
---

Scott's right about reducing the character count to 2^14: UTF-8 characters may be up to four bytes each (though that is a [gross overestimate|http://stackoverflow.com/questions/9533258/what-is-the-maximum-number-of-bytes-for-a-utf-8-encoded-character]). I think it would be more likely to be readable in 2^14-character chunks, too.

IDL code-generation generates too-long literals for very large schemas
---

Key: AVRO-1316
URL: https://issues.apache.org/jira/browse/AVRO-1316
Project: Avro
Issue Type: Bug
Components: java
Affects Versions: 1.7.5
Reporter: Jeremy Kahn
Priority: Minor
Labels: patch
Attachments: AVRO-1316.patch

When I work from a very large IDL schema, the Java code generated includes a schema JSON literal that exceeds the length of the maximum allowed literal string ([65535 characters|http://stackoverflow.com/questions/8323082/size-of-initialisation-string-in-java]). This creates weird Maven errors like: {{[ERROR] ...FooProtocol.java:[13,89] constant string too long}}.

It might seem a little crazy, but a 64-kilobyte JSON protocol isn't outrageous at all for some of the more involved data structures, especially if we're including documentation strings etc.

I believe the fix should be a bit more sensitivity to the length of the JSON literal (and a willingness to split it into more than one literal, joined by {{+}}), but I haven't figured out where that change needs to go. Has anyone else encountered this problem?
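The splitting proposed in the description, with the 2^14-character chunk size from the comment, can be illustrated with a small generator-side sketch. It is written in Python purely for illustration; the real change would live in the Java code generator, and the attached patch may work differently. The function name {{java_string_expression}} is hypothetical, and {{json.dumps}} is used as a stand-in for Java string escaping, which is an approximation rather than a complete Java escaper.

```python
import json

# Chunk size suggested in the comment above (2^14 characters).
CHUNK_CHARS = 2 ** 14


def java_string_expression(text, chunk_chars=CHUNK_CHARS):
    """Render text as Java string literals joined by ' + '.

    Each literal holds at most chunk_chars characters of the input.
    json.dumps supplies the quoting and escaping (an approximation of
    Java's rules, assumed good enough for this illustration).
    """
    chunks = [
        text[start:start + chunk_chars]
        for start in range(0, len(text), chunk_chars)
    ] or [""]  # an empty input still needs one (empty) literal
    return " + ".join(json.dumps(chunk) for chunk in chunks)
```

For example, {{java_string_expression('{"a":1}', chunk_chars=4)}} yields two literals joined by {{+}}, each covering at most four characters of the original JSON.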
[jira] [Created] (AVRO-1316) IDL code-generation generates too-long literals for very large schemas
Jeremy Kahn created AVRO-1316: - Summary: IDL code-generation generates too-long literals for very large schemas Key: AVRO-1316 URL: https://issues.apache.org/jira/browse/AVRO-1316 Project: Avro Issue Type: Bug Components: java Reporter: Jeremy Kahn Priority: Minor When I work from a very large IDL schema, the Java code generated includes a schema JSON literal that exceeds the length of the maximum allowed literal string ([65535 characters|http://stackoverflow.com/questions/8323082/size-of-initialisation-string-in-java]). This creates weird Maven errors like: {{[ERROR] ...FooProtocol.java:[13,89] constant string too long}}. It might seem a little crazy, but a 64-kilobyte JSON protocol isn't outrageous at all for some of the more involved data structures, especially if we're including documentation strings etc. I believe the fix should be a bit more sensitivity to the length of the JSON literal (and a willingness to split it into more than one literal, joined by {{+}}), but I haven't figured out where that change needs to go. Has anyone else encountered this problem? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (AVRO-1296) Python: schemas retrieved from protocol types ignore namespace
[ https://issues.apache.org/jira/browse/AVRO-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Kahn updated AVRO-1296: -- Resolution: Fixed Status: Resolved (was: Patch Available) Philip Zeyliger merged the patch, and a followup patch that restored test functionality for Python 2.6. Python: schemas retrieved from protocol types ignore namespace -- Key: AVRO-1296 URL: https://issues.apache.org/jira/browse/AVRO-1296 Project: Avro Issue Type: Bug Components: python Affects Versions: 1.7.4 Reporter: Jeremy Kahn Assignee: Jeremy Kahn Fix For: 1.7.5 Attachments: AVRO-1296a.patch, AVRO-1296b.patch If I parse a protocol {{p}} using {{avro.protocol.parse}}, which defines {{namespace: ns}} and then retrieve a child schema {{s}} from the protocol's {{proto.types}} (or {{proto.types_dict}}), then {{s}} does not have its namespace set (to {{ns}}), even if {{p}} has a namespace. This is particularly problematic if I'm using {{s}} to write out an avro file intended to be read by a specific-type reader, because the file header will claim to be objects of type {{s}} (not {{ns.s}}, as expected). I've attached two patches: one that makes sure that the {{namespace}} property of protocol types is set to the default namespace of the protocol when not otherwise set. The second patch ensures that the {{namespace}} is *not* rendered into JSON when a default protocol specifies the right value already. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (AVRO-1296) Python: schemas retrieved from protocol types ignore namespace
[ https://issues.apache.org/jira/browse/AVRO-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13638196#comment-13638196 ] Jeremy Kahn commented on AVRO-1296: --- Philip, have you received any objections? Could you commit this to trunk? Python: schemas retrieved from protocol types ignore namespace -- Key: AVRO-1296 URL: https://issues.apache.org/jira/browse/AVRO-1296 Project: Avro Issue Type: Bug Components: python Affects Versions: 1.7.4 Reporter: Jeremy Kahn Assignee: Jeremy Kahn Fix For: 1.7.5 Attachments: AVRO-1296a.patch, AVRO-1296b.patch If I parse a protocol {{p}} using {{avro.protocol.parse}}, which defines {{namespace: ns}} and then retrieve a child schema {{s}} from the protocol's {{proto.types}} (or {{proto.types_dict}}), then {{s}} does not have its namespace set (to {{ns}}), even if {{p}} has a namespace. This is particularly problematic if I'm using {{s}} to write out an avro file intended to be read by a specific-type reader, because the file header will claim to be objects of type {{s}} (not {{ns.s}}, as expected). I've attached two patches: one that makes sure that the {{namespace}} property of protocol types is set to the default namespace of the protocol when not otherwise set. The second patch ensures that the {{namespace}} is *not* rendered into JSON when a default protocol specifies the right value already. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (AVRO-1304) Python Avro match_schemas called redundantly
[ https://issues.apache.org/jira/browse/AVRO-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13638198#comment-13638198 ] Jeremy Kahn commented on AVRO-1304: --- Uri, what strategy are you using to try to fix this? Could we memoize the partner schema to short-circuit out of match_schemas (trading a small amount of memory for speed)? I'm eager to improve the speed of the Python library, and a 20% speedup could shave days off my team's product delivery. Contact me offline (jer...@trochee.net) if you'd like to share your profiling setup (I can try to implement related speedups). Python Avro match_schemas called redundantly Key: AVRO-1304 URL: https://issues.apache.org/jira/browse/AVRO-1304 Project: Avro Issue Type: Bug Components: python Affects Versions: 1.7.4 Reporter: Uri Laserson DatumReader.match_schemas(writers_schema, readers_schema) is called on every single read from the DatumReader. However, for almost every read, the schemas used are the object members self.writers_schema and self.readers_schema. match_schemas should be checked only once in this case, and only when the object members are modified. This takes up 20% of my parse time upon profiling. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
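A minimal sketch of the memoization suggested in the comment above, assuming the schema objects are not mutated after the first check. {{MatchCache}} is a hypothetical helper, not part of the Python {{avro}} library; it wraps any two-argument match function, such as one with the shape of {{match_schemas(writers_schema, readers_schema)}}.

```python
class MatchCache(object):
    """Remember match results so each schema pair is checked only once."""

    def __init__(self, match_fn):
        self._match_fn = match_fn  # any callable(writers_schema, readers_schema)
        self._results = {}

    def __call__(self, writers_schema, readers_schema):
        # Key on object identity: the common case is the same two schema
        # objects being passed on every read.
        key = (id(writers_schema), id(readers_schema))
        entry = self._results.get(key)
        if entry is None:
            result = self._match_fn(writers_schema, readers_schema)
            # Hold references to both schemas so their id() values stay unique
            # for as long as the cache entry exists.
            entry = (writers_schema, readers_schema, result)
            self._results[key] = entry
        return entry[2]
```

This trades a small amount of memory (one entry per distinct schema pair) for skipping the repeated check.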
[jira] [Commented] (AVRO-1296) Python: schemas retrieved from protocol types ignore namespace
[ https://issues.apache.org/jira/browse/AVRO-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13638407#comment-13638407 ] Jeremy Kahn commented on AVRO-1296: --- Looks like the Ubuntu 9.10 buildbot complains about this test patch. Updated test code is included here: https://github.com/jkahn/avro/commit/9724cd0e17f338db6a12ebc1fce5132cdf934bc7
{noformat}
@@ -379,7 +379,7 @@ def test_inner_namespace_not_rendered(self):
     self.assertEqual('com.acme.Greeting', proto.types[0].fullname)
     self.assertEqual('Greeting', proto.types[0].name)
     # but there shouldn't be 'namespace' rendered to json on the inner type
-    self.assertNotIn('namespace', proto.to_json()['types'][0])
+    self.assertFalse('namespace' in proto.to_json()['types'][0])
 
   def test_valid_cast_to_string_after_parse(self):
{noformat}
Python: schemas retrieved from protocol types ignore namespace -- Key: AVRO-1296 URL: https://issues.apache.org/jira/browse/AVRO-1296 Project: Avro Issue Type: Bug Components: python Affects Versions: 1.7.4 Reporter: Jeremy Kahn Assignee: Jeremy Kahn Fix For: 1.7.5 Attachments: AVRO-1296a.patch, AVRO-1296b.patch If I parse a protocol {{p}} using {{avro.protocol.parse}}, which defines {{namespace: ns}} and then retrieve a child schema {{s}} from the protocol's {{proto.types}} (or {{proto.types_dict}}), then {{s}} does not have its namespace set (to {{ns}}), even if {{p}} has a namespace. This is particularly problematic if I'm using {{s}} to write out an avro file intended to be read by a specific-type reader, because the file header will claim to be objects of type {{s}} (not {{ns.s}}, as expected). I've attached two patches: one that makes sure that the {{namespace}} property of protocol types is set to the default namespace of the protocol when not otherwise set. The second patch ensures that the {{namespace}} is *not* rendered into JSON when a default protocol specifies the right value already. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (AVRO-1303) Python avro library does not support aliasing for schema evolution
Jeremy Kahn created AVRO-1303: - Summary: Python avro library does not support aliasing for schema evolution Key: AVRO-1303 URL: https://issues.apache.org/jira/browse/AVRO-1303 Project: Avro Issue Type: Bug Components: python Affects Versions: 1.7.4 Reporter: Jeremy Kahn As discussed [on the mailing list|http://mail-archives.apache.org/mod_mbox/avro-user/201304.mbox/%3CCALEq1Z-ncmjLjvCCLeEgm%2BQvMmPAg5%2B0pVW%3De1N-%3DxtQcMApPw%40mail.gmail.com%3E], the Python {{avro}} libraries don't support aliases (the string {{alias}} is found nowhere in the Python source code). We should update the Python library to accept aliases when matching schemas for:
* field names
* named types
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (AVRO-1304) Python Avro match_schemas called redundantly
[ https://issues.apache.org/jira/browse/AVRO-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13637061#comment-13637061 ] Jeremy Kahn commented on AVRO-1304: --- This would be super useful to fix. Do you have a patch prepared? Python Avro match_schemas called redundantly Key: AVRO-1304 URL: https://issues.apache.org/jira/browse/AVRO-1304 Project: Avro Issue Type: Bug Components: python Affects Versions: 1.7.4 Reporter: Uri Laserson DatumReader.match_schemas(writers_schema, readers_schema) is called on every single read from the DatumReader. However, for almost every read, the schemas used are the object members self.writers_schema and self.readers_schema. match_schemas should be checked only once in this case, and only when the object members are modified. This takes up 20% of my parse time upon profiling. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (AVRO-1296) Python: schemas retrieved from protocol types ignore namespace
[ https://issues.apache.org/jira/browse/AVRO-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13633308#comment-13633308 ] Jeremy Kahn commented on AVRO-1296: --- Commenting to nudge this issue. Can somebody review these Python patches? It's a small change but it fixes a fairly serious obstacle to using Avro files as a Java/Python interlingua for on-disk storage. Python: schemas retrieved from protocol types ignore namespace -- Key: AVRO-1296 URL: https://issues.apache.org/jira/browse/AVRO-1296 Project: Avro Issue Type: Bug Components: python Affects Versions: 1.7.4 Reporter: Jeremy Kahn Assignee: Jeremy Kahn Fix For: 1.7.5 Attachments: AVRO-1296a.patch, AVRO-1296b.patch If I parse a protocol {{p}} using {{avro.protocol.parse}}, which defines {{namespace: ns}} and then retrieve a child schema {{s}} from the protocol's {{proto.types}} (or {{proto.types_dict}}), then {{s}} does not have its namespace set (to {{ns}}), even if {{p}} has a namespace. This is particularly problematic if I'm using {{s}} to write out an avro file intended to be read by a specific-type reader, because the file header will claim to be objects of type {{s}} (not {{ns.s}}, as expected). I've attached two patches: one that makes sure that the {{namespace}} property of protocol types is set to the default namespace of the protocol when not otherwise set. The second patch ensures that the {{namespace}} is *not* rendered into JSON when a default protocol specifies the right value already. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (AVRO-1296) Python: schemas retrieved from protocol types ignore namespace
Jeremy Kahn created AVRO-1296: - Summary: Python: schemas retrieved from protocol types ignore namespace Key: AVRO-1296 URL: https://issues.apache.org/jira/browse/AVRO-1296 Project: Avro Issue Type: Bug Components: python Affects Versions: 1.7.4 Reporter: Jeremy Kahn Assignee: Jeremy Kahn Fix For: 1.7.5 If I parse a protocol {{p}} using {{avro.protocol.parse}}, which defines {{namespace: ns}} and then retrieve a child schema {{s}} from the protocol's {{proto.types}} (or {{proto.types_dict}}), then {{s}} does not have its namespace set (to {{ns}}), even if {{p}} has a namespace. This is particularly problematic if I'm using {{s}} to write out an avro file intended to be read by a specific-type reader, because the file header will claim to be objects of type {{s}} (not {{ns.s}}, as expected). I've attached two patches: one that makes sure that the {{namespace}} property of protocol types is set to the default namespace of the protocol when not otherwise set. The second patch ensures that the {{namespace}} is *not* rendered into JSON when a default protocol specifies the right value already. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
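The behavior the two patches aim for can be sketched on plain dictionaries. This is an illustration only: {{apply_default_namespace}}, {{fullname}} and {{render_type}} are hypothetical helpers, not functions from {{avro.protocol}} or {{avro.schema}}, and the protocol is assumed to be already-parsed JSON.

```python
def apply_default_namespace(protocol):
    """First patch, sketched: types inherit the protocol's namespace."""
    default_ns = protocol.get("namespace")
    resolved = []
    for type_def in protocol.get("types", []):
        type_def = dict(type_def)  # copy so the input is left untouched
        if default_ns and "namespace" not in type_def and "." not in type_def["name"]:
            type_def["namespace"] = default_ns
        resolved.append(type_def)
    return resolved


def fullname(type_def):
    """Namespace-qualified name, e.g. ns.s rather than s."""
    name = type_def["name"]
    namespace = type_def.get("namespace")
    return name if "." in name or not namespace else namespace + "." + name


def render_type(type_def, default_ns):
    """Second patch, sketched: omit a namespace equal to the enclosing default."""
    rendered = dict(type_def)
    if rendered.get("namespace") == default_ns:
        rendered.pop("namespace", None)
    return rendered
```

With a protocol whose namespace is {{com.acme}} and a type named {{Greeting}}, the resolved type reports the full name {{com.acme.Greeting}}, while its rendered JSON carries no {{namespace}} key.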
[jira] [Work started] (AVRO-1296) Python: schemas retrieved from protocol types ignore namespace
[ https://issues.apache.org/jira/browse/AVRO-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on AVRO-1296 started by Jeremy Kahn. Python: schemas retrieved from protocol types ignore namespace -- Key: AVRO-1296 URL: https://issues.apache.org/jira/browse/AVRO-1296 Project: Avro Issue Type: Bug Components: python Affects Versions: 1.7.4 Reporter: Jeremy Kahn Assignee: Jeremy Kahn Fix For: 1.7.5 If I parse a protocol {{p}} using {{avro.protocol.parse}}, which defines {{namespace: ns}} and then retrieve a child schema {{s}} from the protocol's {{proto.types}} (or {{proto.types_dict}}), then {{s}} does not have its namespace set (to {{ns}}), even if {{p}} has a namespace. This is particularly problematic if I'm using {{s}} to write out an avro file intended to be read by a specific-type reader, because the file header will claim to be objects of type {{s}} (not {{ns.s}}, as expected). I've attached two patches: one that makes sure that the {{namespace}} property of protocol types is set to the default namespace of the protocol when not otherwise set. The second patch ensures that the {{namespace}} is *not* rendered into JSON when a default protocol specifies the right value already. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (AVRO-1296) Python: schemas retrieved from protocol types ignore namespace
[ https://issues.apache.org/jira/browse/AVRO-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Kahn updated AVRO-1296: -- Attachment: AVRO-1296b.patch AVRO-1296a.patch AVRO-1296a and AVRO-1296b are the two patches mentioned in the OP. Python: schemas retrieved from protocol types ignore namespace -- Key: AVRO-1296 URL: https://issues.apache.org/jira/browse/AVRO-1296 Project: Avro Issue Type: Bug Components: python Affects Versions: 1.7.4 Reporter: Jeremy Kahn Assignee: Jeremy Kahn Fix For: 1.7.5 Attachments: AVRO-1296a.patch, AVRO-1296b.patch If I parse a protocol {{p}} using {{avro.protocol.parse}}, which defines {{namespace: ns}} and then retrieve a child schema {{s}} from the protocol's {{proto.types}} (or {{proto.types_dict}}), then {{s}} does not have its namespace set (to {{ns}}), even if {{p}} has a namespace. This is particularly problematic if I'm using {{s}} to write out an avro file intended to be read by a specific-type reader, because the file header will claim to be objects of type {{s}} (not {{ns.s}}, as expected). I've attached two patches: one that makes sure that the {{namespace}} property of protocol types is set to the default namespace of the protocol when not otherwise set. The second patch ensures that the {{namespace}} is *not* rendered into JSON when a default protocol specifies the right value already. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (AVRO-1296) Python: schemas retrieved from protocol types ignore namespace
[ https://issues.apache.org/jira/browse/AVRO-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Kahn updated AVRO-1296: -- Status: Patch Available (was: In Progress) Python tests pass ({{ant clean build test}}) after each of these patches is applied. Each patch includes new tests that fail before the patch and succeed after it. Python: schemas retrieved from protocol types ignore namespace -- Key: AVRO-1296 URL: https://issues.apache.org/jira/browse/AVRO-1296 Project: Avro Issue Type: Bug Components: python Affects Versions: 1.7.4 Reporter: Jeremy Kahn Assignee: Jeremy Kahn Fix For: 1.7.5 Attachments: AVRO-1296a.patch, AVRO-1296b.patch If I parse a protocol {{p}} using {{avro.protocol.parse}}, which defines {{namespace: ns}} and then retrieve a child schema {{s}} from the protocol's {{proto.types}} (or {{proto.types_dict}}), then {{s}} does not have its namespace set (to {{ns}}), even if {{p}} has a namespace. This is particularly problematic if I'm using {{s}} to write out an avro file intended to be read by a specific-type reader, because the file header will claim to be objects of type {{s}} (not {{ns.s}}, as expected). I've attached two patches: one that makes sure that the {{namespace}} property of protocol types is set to the default namespace of the protocol when not otherwise set. The second patch ensures that the {{namespace}} is *not* rendered into JSON when a default protocol specifies the right value already. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (AVRO-1291) Python library missing strict JSON encode/decode
Jeremy Kahn created AVRO-1291: - Summary: Python library missing strict JSON encode/decode Key: AVRO-1291 URL: https://issues.apache.org/jira/browse/AVRO-1291 Project: Avro Issue Type: Bug Components: python Reporter: Jeremy Kahn The Python Avro libraries don't actually have a proper JSON decoder or encoder, because they don't handle the [type-hinting for unions|http://avro.apache.org/docs/current/spec.html#json_encoding] properly. The Python {{avro.io}} library should provide a pair of classes, {{StrictJsonEncoder}} and {{StrictJsonDecoder}}, that correctly include (and decode) the type hints when the schema expects a union. Jonathan Coveney [raised this concern|http://mail-archives.apache.org/mod_mbox/avro-user/201304.mbox/%3CCAKne9Z6nkYXwb4QzPr4qNyH1o7TnL1674MspgnHuKMuD2imguQ%40mail.gmail.com%3E] on the Avro User mailing list. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
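For reference, the union type hints mentioned above work like this in the Avro JSON encoding: a {{null}} value is written as JSON {{null}}, and any other value is wrapped in a one-entry object keyed by its type name. The sketch below covers unions of primitive types only; {{encode_union}} and {{decode_union}} are hypothetical names and do not exist in {{avro.io}}.

```python
# Checks for a few primitive branches; bool is excluded from the integer
# types because bool is a subclass of int in Python.
_CHECKS = {
    "boolean": lambda d: isinstance(d, bool),
    "int": lambda d: isinstance(d, int) and not isinstance(d, bool),
    "long": lambda d: isinstance(d, int) and not isinstance(d, bool),
    "float": lambda d: isinstance(d, float),
    "double": lambda d: isinstance(d, float),
    "string": lambda d: isinstance(d, str),
}


def encode_union(datum, branches):
    """Wrap datum with the name of the first union branch it matches."""
    for branch in branches:
        if branch == "null":
            if datum is None:
                return None
        elif _CHECKS[branch](datum):
            return {branch: datum}
    raise ValueError("datum matches no branch of the union")


def decode_union(encoded, branches):
    """Unwrap a hinted value, checking the hint against the union."""
    if encoded is None:
        if "null" in branches:
            return None
        raise ValueError("null is not a branch of the union")
    if not isinstance(encoded, dict) or len(encoded) != 1:
        raise ValueError("expected a single-entry type-hint object")
    (branch, value), = encoded.items()
    if branch not in branches:
        raise ValueError("unknown union branch %r" % branch)
    return value
```

A strict encoder would write {{"a"}} in a {{["null", "string"]}} union as {{{"string": "a"}}} rather than as the bare string.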
[jira] [Created] (AVRO-1289) Python: Schema objects should polymorphically interact with data-walker interface
Jeremy Kahn created AVRO-1289: - Summary: Python: Schema objects should polymorphically interact with data-walker interface Key: AVRO-1289 URL: https://issues.apache.org/jira/browse/AVRO-1289 Project: Avro Issue Type: Improvement Components: python Affects Versions: 1.7.5 Reporter: Jeremy Kahn Assignee: Jeremy Kahn Priority: Minor Fix For: 1.8.0 Python {{avro.schema}} objects should be able to call back to a general data-and-schema parallel-walker (a validator would be one such walker, and a default-filler could be another). There should be an {{avro.walker}} interface that owns the parallel state (a datum-reader/deserializer, a datum-writer/serializer, a validator, or a default-filler -- see AVRO-1265). Schema polymorphism would allow us to eliminate the large (and highly redundant) function-dispatch methods in {{avro.io}} by making the {{avro.schema.Schema}} subclass responsible for calling back to the {{avro.walker}} object. Assigning this to v1.8.0 because it may be difficult to duplicate *every* behavior of 1.7.* with the same function signatures, especially where this refactor may eliminate entire classes. This factoring ought to make it easier to improve or extend objects that meet this {{walker}} interface -- validators and serializers might be able to store more state about their position within a record, for example, to yield more informative error messages upon mismatch (as requested by Jonathan Coveney on the user mailing list). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
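One way to picture the proposed callback arrangement is a small visitor-style sketch. No {{avro.walker}} module exists yet, so every name here ({{accept}}, {{visit_primitive}}, {{visit_array}}, {{ValidatingWalker}}, and the simplified schema classes) is hypothetical and chosen only to show the shape of the dispatch.

```python
class PrimitiveSchema(object):
    def __init__(self, type_name, python_type):
        self.type_name = type_name
        self.python_type = python_type

    def accept(self, walker, datum):
        # The schema subclass decides which walker callback applies.
        return walker.visit_primitive(self, datum)


class ArraySchema(object):
    def __init__(self, items):
        self.items = items

    def accept(self, walker, datum):
        return walker.visit_array(self, datum)


class ValidatingWalker(object):
    """One possible walker: checks that a datum fits the schema."""

    def visit_primitive(self, schema, datum):
        return isinstance(datum, schema.python_type)

    def visit_array(self, schema, datum):
        return isinstance(datum, list) and all(
            schema.items.accept(self, item) for item in datum)
```

A default-filler or a serializer would be another class with the same {{visit_*}} methods, so no type-switching function is needed in the walker itself.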
[jira] [Commented] (AVRO-1286) Python script avro cat should be able to read from stdin
[ https://issues.apache.org/jira/browse/AVRO-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13620615#comment-13620615 ] Jeremy Kahn commented on AVRO-1286: --- Biggest headache here is that the python avro data file library requires that the file be seekable. Standard in is not seekable. I think this is a bug or a misfeature in the python library and probably deserves a ticket of its own. Python script avro cat should be able to read from stdin Key: AVRO-1286 URL: https://issues.apache.org/jira/browse/AVRO-1286 Project: Avro Issue Type: Bug Components: python Reporter: Uri Laserson Priority: Minor Currently, you have to specify a target file on the command line. But it would be nice to be able to stream data through avro cat. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
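Until the data file reader can work on non-seekable input, one workaround is to buffer the stream before handing it over. The sketch below uses only the standard library; {{as_seekable}} is a hypothetical helper, and it assumes the whole input fits in memory, which gives up true streaming.

```python
import io


def as_seekable(stream):
    """Return stream if it can seek, else an in-memory copy that can."""
    try:
        if stream.seekable():
            return stream
    except AttributeError:
        # Older file-like objects may not offer seekable(); buffer them too.
        pass
    return io.BytesIO(stream.read())
```

A script could then pass {{as_seekable(sys.stdin.buffer)}} (Python 3) to the reader in place of an opened file.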
[jira] [Updated] (AVRO-1284) Python: validation should be a method of Schema objects
[ https://issues.apache.org/jira/browse/AVRO-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Kahn updated AVRO-1284: -- Labels: patch (was: ) Python: validation should be a method of Schema objects --- Key: AVRO-1284 URL: https://issues.apache.org/jira/browse/AVRO-1284 Project: Avro Issue Type: Improvement Components: python Reporter: Jeremy Kahn Assignee: Jeremy Kahn Priority: Minor Labels: patch Fix For: 1.7.5 Attachments: validation-as-method-backwards-compatible.patch, validation-as-method.patch In Python, validation of a datum by the schema was done in {{avro.io.validate}} function. The {{avro.io.validate}} function is a complex, recursively-called switch statement. Instead of calling a two-argument {{avro.io.validate}} with a Schema object and a datum, it is easier to understand and extend if they are one-argument methods on the schema. I (Jeremy) have written a patch that implements {{validate}} methods on Schema objects. This patch will form the prerequisite for AVRO-1265 (see easier to extend above). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Work stopped] (AVRO-1284) Python: validation should be a method of Schema objects
[ https://issues.apache.org/jira/browse/AVRO-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on AVRO-1284 stopped by Jeremy Kahn. Python: validation should be a method of Schema objects --- Key: AVRO-1284 URL: https://issues.apache.org/jira/browse/AVRO-1284 Project: Avro Issue Type: Improvement Components: python Reporter: Jeremy Kahn Assignee: Jeremy Kahn Priority: Minor Fix For: 1.7.5 Attachments: validation-as-method-backwards-compatible.patch, validation-as-method.patch In Python, validation of a datum by the schema was done in {{avro.io.validate}} function. The {{avro.io.validate}} function is a complex, recursively-called switch statement. Instead of calling a two-argument {{avro.io.validate}} with a Schema object and a datum, it is easier to understand and extend if they are one-argument methods on the schema. I (Jeremy) have written a patch that implements {{validate}} methods on Schema objects. This patch will form the prerequisite for AVRO-1265 (see easier to extend above). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (AVRO-1284) Python: validation should be a method of Schema objects
[ https://issues.apache.org/jira/browse/AVRO-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Kahn updated AVRO-1284: -- Status: Patch Available (was: Open) Seems to be a working fix. tests pass. Python: validation should be a method of Schema objects --- Key: AVRO-1284 URL: https://issues.apache.org/jira/browse/AVRO-1284 Project: Avro Issue Type: Improvement Components: python Reporter: Jeremy Kahn Assignee: Jeremy Kahn Priority: Minor Labels: patch Fix For: 1.7.5 Attachments: validation-as-method-backwards-compatible.patch, validation-as-method.patch In Python, validation of a datum by the schema was done in {{avro.io.validate}} function. The {{avro.io.validate}} function is a complex, recursively-called switch statement. Instead of calling a two-argument {{avro.io.validate}} with a Schema object and a datum, it is easier to understand and extend if they are one-argument methods on the schema. I (Jeremy) have written a patch that implements {{validate}} methods on Schema objects. This patch will form the prerequisite for AVRO-1265 (see easier to extend above). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (AVRO-1265) Python: schema objects should support builder() default-filling behavior
[ https://issues.apache.org/jira/browse/AVRO-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Kahn updated AVRO-1265: -- Attachment: avro-1265b-tests.patch avro-1265a-build-defaults.patch Implement default-build behavior on schema and update tests to do (rather cursory) testing of this behavior Python: schema objects should support builder() default-filling behavior Key: AVRO-1265 URL: https://issues.apache.org/jira/browse/AVRO-1265 Project: Avro Issue Type: Improvement Components: python Reporter: Jeremy Kahn Assignee: Jeremy Kahn Priority: Minor Labels: features Fix For: 1.7.5 Attachments: avro-1265a-build-defaults.patch, avro-1265b-tests.patch There seems to be no way to easily use the avro libraries in Python (where I feel most qualified to comment) to encode generics with missing default values and have them transmitted in well-formed avro binary. If you fill in the missing default values, the Python libraries will transmit correctly. I'd be happy to add methods to the avro.RecordSchema objects (in the Python libraries) that fill defaults on missing member fields of a record, recursively (which probably means method extension of other schema classes as well). For backwards compatibility (and probably to avoid unnecessary data traversal), clients probably want to explicitly ask the schema to fill in defaults before transmission in the cases where you'd like to set only the non-default values. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
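The default-filling behavior described in the issue can be sketched on plain JSON-style schemas. This is not the attached patch: {{fill_defaults}} is a hypothetical function, it handles only record schemas given as dictionaries, and it returns a new datum instead of modifying the one passed in.

```python
import copy


def fill_defaults(schema, datum):
    """Return a copy of datum with missing record fields set to their defaults."""
    is_record = isinstance(schema, dict) and schema.get("type") == "record"
    if not is_record or not isinstance(datum, dict):
        return datum  # non-record schemas are passed through unchanged
    filled = dict(datum)
    for field in schema["fields"]:
        name = field["name"]
        if name in filled:
            value = filled[name]
        elif "default" in field:
            value = copy.deepcopy(field["default"])
        else:
            raise ValueError("no value or default for field %r" % name)
        # Recurse so nested records get their own defaults filled in.
        filled[name] = fill_defaults(field["type"], value)
    return filled
```

A client that sets only the non-default values would call this explicitly just before writing, which keeps the existing write path unchanged for everyone else.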
[jira] [Created] (AVRO-1284) Python: validation should be a method of Schema objects
Jeremy Kahn created AVRO-1284: - Summary: Python: validation should be a method of Schema objects Key: AVRO-1284 URL: https://issues.apache.org/jira/browse/AVRO-1284 Project: Avro Issue Type: Improvement Components: python Reporter: Jeremy Kahn Assignee: Jeremy Kahn Priority: Minor Fix For: 1.7.5 In Python, validation of a datum against a schema is done in the {{avro.io.validate}} function, which is a complex, recursively called switch statement. Instead of calling a two-argument {{avro.io.validate}} with a Schema object and a datum, validation is easier to understand and extend as a one-argument method on the schema. I (Jeremy) have written a patch that implements {{validate}} methods on Schema objects. This patch will form the prerequisite for AVRO-1265 (see "easier to extend" above). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Work started] (AVRO-1284) Python: validation should be a method of Schema objects
[ https://issues.apache.org/jira/browse/AVRO-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on AVRO-1284 started by Jeremy Kahn. Python: validation should be a method of Schema objects --- Key: AVRO-1284 URL: https://issues.apache.org/jira/browse/AVRO-1284 Project: Avro Issue Type: Improvement Components: python Reporter: Jeremy Kahn Assignee: Jeremy Kahn Priority: Minor Fix For: 1.7.5 In Python, validation of a datum by the schema was done in {{avro.io.validate}} function. The {{avro.io.validate}} function is a complex, recursively-called switch statement. Instead of calling a two-argument {{avro.io.validate}} with a Schema object and a datum, it is easier to understand and extend if they are one-argument methods on the schema. I (Jeremy) have written a patch that implements {{validate}} methods on Schema objects. This patch will form the prerequisite for AVRO-1265 (see easier to extend above). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (AVRO-1284) Python: validation should be a method of Schema objects
[ https://issues.apache.org/jira/browse/AVRO-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Kahn updated AVRO-1284: -- Attachment: validation-as-method.patch Python: validation should be a method of Schema objects --- Key: AVRO-1284 URL: https://issues.apache.org/jira/browse/AVRO-1284 Project: Avro Issue Type: Improvement Components: python Reporter: Jeremy Kahn Assignee: Jeremy Kahn Priority: Minor Fix For: 1.7.5 Attachments: validation-as-method.patch In Python, validation of a datum by the schema was done in {{avro.io.validate}} function. The {{avro.io.validate}} function is a complex, recursively-called switch statement. Instead of calling a two-argument {{avro.io.validate}} with a Schema object and a datum, it is easier to understand and extend if they are one-argument methods on the schema. I (Jeremy) have written a patch that implements {{validate}} methods on Schema objects. This patch will form the prerequisite for AVRO-1265 (see easier to extend above). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (AVRO-1284) Python: validation should be a method of Schema objects
[ https://issues.apache.org/jira/browse/AVRO-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeremy Kahn updated AVRO-1284:
--
Attachment: validation-as-method-backwards-compatible.patch

The {{validation-as-method-backwards-compatible}} patch preserves the function-call interface of {{avro.io.validate}}, which now calls the new method, in case users are calling {{avro.io.validate}} directly. Prefer this patch to the simpler {{validation-as-method}} patch.
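The idea behind the backwards-compatible patch can be sketched roughly as follows. This is a minimal illustration, not the contents of the attached patch: the classes below are hypothetical stand-ins rather than the real {{avro.schema}} classes, and only a few schema types are covered. Each schema class validates its own data, and a module-level two-argument {{validate}} function remains as a thin wrapper for existing callers.

```python
class Schema:
    """Hypothetical stand-in for a schema base class (not the real avro.schema.Schema)."""

    def validate(self, datum):
        raise NotImplementedError


class IntSchema(Schema):
    def validate(self, datum):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return (isinstance(datum, int) and not isinstance(datum, bool)
                and -2**31 <= datum < 2**31)


class StringSchema(Schema):
    def validate(self, datum):
        return isinstance(datum, str)


class ArraySchema(Schema):
    def __init__(self, items):
        self.items = items  # schema of each element

    def validate(self, datum):
        # Recursion happens through the element schema's own method.
        return isinstance(datum, list) and all(self.items.validate(d) for d in datum)


class RecordSchema(Schema):
    def __init__(self, fields):
        self.fields = fields  # dict: field name -> Schema

    def validate(self, datum):
        return isinstance(datum, dict) and all(
            schema.validate(datum.get(name)) for name, schema in self.fields.items())


def validate(expected_schema, datum):
    """Two-argument wrapper kept for existing callers; it delegates to the method."""
    return expected_schema.validate(datum)
```

With this shape, supporting a new schema type means adding one class with its own {{validate}} method instead of adding a branch to a central switch statement.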
[jira] [Work started] (AVRO-1265) Python: schema objects should support builder() default-filling behavior
[ https://issues.apache.org/jira/browse/AVRO-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on AVRO-1265 started by Jeremy Kahn.

Python: schema objects should support builder() default-filling behavior
Key: AVRO-1265
URL: https://issues.apache.org/jira/browse/AVRO-1265
Project: Avro
Issue Type: Improvement
Components: python
Reporter: Jeremy Kahn
Assignee: Jeremy Kahn
Priority: Minor
Labels: features
Fix For: 1.7.5

There seems to be no way to easily use the avro libraries in Python (where I feel most qualified to comment) to encode generics with missing default values and have them transmitted in well-formed avro binary. If you fill in the missing default values, the Python libraries will transmit correctly. I'd be happy to add methods to the avro.RecordSchema objects (in the Python libraries) that fill defaults on missing member fields of a record, recursively (which probably means method extension of other schema classes as well). For backwards compatibility (and probably to avoid unnecessary data traversal), clients probably want to explicitly ask the schema to fill in defaults before transmission in the cases where you'd like to set only the non-default values.
[jira] [Created] (AVRO-1265) Python: schema objects should support builder() default-filling behavior
Jeremy Kahn created AVRO-1265:
-
Summary: Python: schema objects should support builder() default-filling behavior
Key: AVRO-1265
URL: https://issues.apache.org/jira/browse/AVRO-1265
Project: Avro
Issue Type: Improvement
Components: python
Reporter: Jeremy Kahn
Assignee: Jeremy Kahn
Priority: Minor
Fix For: 1.7.5

There seems to be no way to easily use the avro libraries in Python (where I feel most qualified to comment) to encode generics with missing default values and have them transmitted in well-formed avro binary. If you fill in the missing default values, the Python libraries will transmit correctly. I'd be happy to add methods to the avro.RecordSchema objects (in the Python libraries) that fill defaults on missing member fields of a record, recursively (which probably means method extension of other schema classes as well). For backwards compatibility (and probably to avoid unnecessary data traversal), clients probably want to explicitly ask the schema to fill in defaults before transmission in the cases where you'd like to set only the non-default values.
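As a rough illustration of the recursive default-filling described in this ticket, here is a minimal sketch. It makes several assumptions: the function name {{fill_defaults}} is hypothetical, the schema is handled in its parsed-JSON form (plain dicts, lists and strings) rather than as {{avro.schema}} objects, and unions and references to previously named types are passed through untouched. It is not the code proposed for the ticket.

```python
import copy


def fill_defaults(schema, datum):
    """Return a copy of `datum` with missing record fields filled from defaults.

    `schema` is a parsed-JSON Avro schema, not an avro.schema object.
    The caller invokes this explicitly before encoding, as the ticket suggests.
    """
    if isinstance(schema, dict) and schema.get("type") == "record":
        filled = dict(datum)  # shallow copy; keys unknown to the schema are kept
        for field in schema["fields"]:
            name = field["name"]
            if name in datum:
                # Present: recurse, because nested records may have gaps too.
                filled[name] = fill_defaults(field["type"], datum[name])
            elif "default" in field:
                # Deep-copy so callers cannot mutate the schema's default value.
                filled[name] = copy.deepcopy(field["default"])
            else:
                raise ValueError("no value and no default for field %r" % name)
        return filled
    if isinstance(schema, dict) and schema.get("type") == "array":
        return [fill_defaults(schema["items"], item) for item in datum]
    if isinstance(schema, dict) and schema.get("type") == "map":
        return {k: fill_defaults(schema["values"], v) for k, v in datum.items()}
    # Primitives, enums, fixed, unions and named references: unchanged in this sketch.
    return datum
```

A caller would set only the non-default values, call {{fill_defaults(schema, datum)}}, and hand the result to the encoder.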
[jira] [Commented] (AVRO-1265) Python: schema objects should support builder() default-filling behavior
[ https://issues.apache.org/jira/browse/AVRO-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589764#comment-13589764 ]
Jeremy Kahn commented on AVRO-1265:
---
See [this thread|http://mail-archives.apache.org/mod_mbox/avro-user/201302.mbox/%3cca+i_aek0-rofp5fmwte7at0jyzhrvsq9nmjubvovrkbex6m...@mail.gmail.com%3E] on the mailing list.
[jira] [Commented] (AVRO-1265) Python: schema objects should support builder() default-filling behavior
[ https://issues.apache.org/jira/browse/AVRO-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589970#comment-13589970 ]
Jeremy Kahn commented on AVRO-1265:
---
Here's [where development is happening|https://github.com/jkahn/avro/tree/feature/fill-defaults] for this ticket. I need to add tests, and won't propose a patch until I have them. I'm hoping to find time to write the tests tomorrow or early next week.
[jira] [Created] (AVRO-1255) Python schema (message, protocol) to_json names argument should be optional
Jeremy Kahn created AVRO-1255:
-
Summary: Python schema (message, protocol) to_json names argument should be optional
Key: AVRO-1255
URL: https://issues.apache.org/jira/browse/AVRO-1255
Project: Avro
Issue Type: Improvement
Components: python
Affects Versions: 1.7.3
Reporter: Jeremy Kahn
Priority: Minor

The {{avro.protocol.Protocol}}, {{avro.protocol.Message}}, and various classes in {{avro.schema}} all support a {{to_json}} method which renders the data in Python generics (easily renderable to json). These methods all take a required {{names}} argument (of type {{avro.schema.Names}} which stores state representing what types have already been rendered. For debugging -- and for other uses of the schema -- it is helpful if the {{names}} argument is optional. When it is not provided, each method should construct an empty {{schema.Names}} object internally. {{to_json}} thus can be invoked without argument to get the relevant rendering of the current schema in isolation. Patch to be attached.
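The requested behavior can be sketched as below. This is an assumption-laden illustration rather than the attached patch: {{Names}} and {{RecordSchema}} here are simplified, hypothetical stand-ins for the classes in {{avro.schema}}, and namespaces are ignored. The point is the {{names=None}} default, with a fresh {{Names}} object built inside the method when none is passed.

```python
class Names:
    """Hypothetical stand-in for avro.schema.Names: tracks already-rendered named types."""

    def __init__(self):
        self.names = {}


class RecordSchema:
    """Hypothetical, simplified record schema (not the real avro.schema class)."""

    def __init__(self, name, fields):
        self.name = name
        self.fields = fields  # list of (field_name, type); type is a str or RecordSchema

    def to_json(self, names=None):
        # Build a fresh Names when the caller passes none. A default of
        # `names=Names()` in the signature would be shared across calls.
        if names is None:
            names = Names()
        if self.name in names.names:
            return self.name  # already rendered once: refer to it by name
        names.names[self.name] = self
        return {
            "type": "record",
            "name": self.name,
            "fields": [
                {"name": fname,
                 "type": ftype.to_json(names) if isinstance(ftype, RecordSchema) else ftype}
                for fname, ftype in self.fields
            ],
        }
```

Callers that render several schemas into one document can still pass a shared {{Names}} object explicitly; calling {{to_json()}} with no argument renders the schema in isolation.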
[jira] [Updated] (AVRO-1255) Python schema (message, protocol) to_json names argument should be optional
[ https://issues.apache.org/jira/browse/AVRO-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeremy Kahn updated AVRO-1255:
--
Attachment: avro-1255.patch

Python schema (message, protocol) to_json names argument should be optional
---
Key: AVRO-1255
URL: https://issues.apache.org/jira/browse/AVRO-1255
Project: Avro
Issue Type: Improvement
Components: python
Affects Versions: 1.7.3
Reporter: Jeremy Kahn
Priority: Minor
Labels: patch
Attachments: avro-1255.patch
[jira] [Commented] (AVRO-1255) Python schema (message, protocol) to_json names argument should be optional
[ https://issues.apache.org/jira/browse/AVRO-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13578606#comment-13578606 ]
Jeremy Kahn commented on AVRO-1255:
---
{{cd lang/py; ant build test}} passes all the tests with this patch applied, AFAICT.
[jira] [Updated] (AVRO-1255) Python schema (message, protocol) to_json names argument should be optional
[ https://issues.apache.org/jira/browse/AVRO-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeremy Kahn updated AVRO-1255:
--
Description:
The {{avro.protocol.Protocol}}, {{avro.protocol.Message}}, and various classes in {{avro.schema}} all support a {{to_json}} method which renders the data in Python generics (easily renderable to json). These methods all take a required {{names}} argument (of type {{avro.schema.Names}}) which stores state representing what types have already been rendered. For debugging -- and for other uses of the schema -- it is helpful if the {{names}} argument is optional. When it is not provided, each method should construct an empty {{schema.Names}} object internally. {{to_json}} thus can be invoked without argument to get the relevant rendering of the current schema in isolation. Patch to be attached.

was:
The {{avro.protocol.Protocol}}, {{avro.protocol.Message}}, and various classes in {{avro.schema}} all support a {{to_json}} method which renders the data in Python generics (easily renderable to json). These methods all take a required {{names}} argument (of type {{avro.schema.Names}} which stores state representing what types have already been rendered. For debugging -- and for other uses of the schema -- it is helpful if the {{names}} argument is optional. When it is not provided, each method should construct an empty {{schema.Names}} object internally. {{to_json}} thus can be invoked without argument to get the relevant rendering of the current schema in isolation. Patch to be attached.
[jira] [Commented] (AVRO-1255) Python schema (message, protocol) to_json names argument should be optional
[ https://issues.apache.org/jira/browse/AVRO-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13578702#comment-13578702 ]
Jeremy Kahn commented on AVRO-1255:
---
Will add some tests that generate the generic schema without the {{names}} parameter (exercising this new functionality), and send a second patch unifying the changes.
[jira] [Updated] (AVRO-1255) Python schema (message, protocol) to_json names argument should be optional
[ https://issues.apache.org/jira/browse/AVRO-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeremy Kahn updated AVRO-1255:
--
Attachment: avro-1255-b.patch
[jira] [Commented] (AVRO-1255) Python schema (message, protocol) to_json names argument should be optional
[ https://issues.apache.org/jira/browse/AVRO-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13578729#comment-13578729 ]
Jeremy Kahn commented on AVRO-1255:
---
New patch added. Rather than adding new tests, I discovered that several stringification functions (used throughout the tests) could be simplified using the now-optional {{names}} argument. The new patch (1255-b) simplifies those stringification methods in just that way, so the new behavior is well exercised by the tests.