twmb commented on code in PR #805:
URL: https://github.com/apache/avro/pull/805#discussion_r841335930
##########
doc/src/content/xdocs/spec.xml:
##########
@@ -1310,6 +1310,92 @@
</ul>
</section>
+ <section>
+ <title>Standard Canonical Form for Schemas</title>
+
+ <p>One of defined way to normalize the avro schema using
+ <em>Standard Canonical Form Transformation</em>. This involves
+ stripping unwanted properties and maintain same canonical
+ ordering. The canonical ordering involves ordering avro
+ reserved properties followed by custom properties if mentioned while
+ transforming. Normalization schema which helps to reduce the
+ total memory size of schema (removed unwanted properties and whitespace)
+ while transfer avro schema between two system and also reduce the parsing
+ time for compatibility check and schema evolution.
+ </p>
+
+ <p><em>Standard Canonical Form</em> is a transformation of a schema
+ into standard canonical ordered. It contains only avro reserved
+ properties <code>"name", "type", "fields", "symbols", "items", "values",
+ "logicalType", "size", "order", "doc", "aliases", "default"</code>
+ and <em>other (custom properties)</em> schema properties.
+ </p>
+
+ <section>
+ <title>Transforming into Standard Canonical Form</title>
+
+ <p>Assuming an input schema (in JSON form) that's already
+ UTF-8 text for a <em>valid</em> Avro schema (including all
+ quotes as required by JSON), the following transformations
+ will produce its Standard Canonical Form:</p>
+ <ul>
+ <li> [PRIMITIVES] Convert primitive schemas to their simple
+ form (e.g., <code>int</code> instead of
+ <code>{"type":"int"}</code>).</li>
+
+ <li> [FULLNAMES] Replace short names with fullnames, using
+ applicable namespaces to do so. Then eliminate
+ <code>namespace</code> attributes, which are now redundant.</li>
+
+ <li> [STRIP] Keep only attributes that are relevant to
+ reserved properties, which are:
+ <code>type</code>, <code>name</code>,
Review Comment:
`size` is only relevant for `fixed`; it should not be present in any other
type.
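
To illustrate the point, here is a minimal Python sketch of how a [STRIP] step could treat `size`. It is not taken from this PR or from any Avro library: the function name `strip_attributes` and the recursion strategy are hypothetical, and the property list is simply copied from the proposed spec text above.

```python
# Reserved properties as listed in the proposed spec text.
RESERVED = ("name", "type", "fields", "symbols", "items", "values",
            "logicalType", "size", "order", "doc", "aliases", "default")


def strip_attributes(schema):
    """Hypothetical STRIP step: keep reserved properties, but `size` only on `fixed`."""
    if isinstance(schema, list):       # union: strip each branch
        return [strip_attributes(s) for s in schema]
    if not isinstance(schema, dict):   # primitive name or reference to a named type
        return schema
    out = {}
    for key in RESERVED:
        if key not in schema:
            continue
        if key == "size" and schema.get("type") != "fixed":
            continue                   # `size` has no meaning outside `fixed`
        value = schema[key]
        if key in ("type", "items", "values"):
            value = strip_attributes(value)   # nested schema
        elif key == "fields":
            value = [strip_attributes(f) for f in value]
        out[key] = value
    return out
```

With this sketch, a stray `"size"` on a record is dropped, while the `"size"` of a nested `fixed` schema is kept.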