[avro] branch branch-1.11 updated: docs: Change index.md to add a schema for data blocks (#2042)

rskraba Wed, 01 Mar 2023 11:04:22 -0800

This is an automated email from the ASF dual-hosted git repository.

rskraba pushed a commit to branch branch-1.11
in repository https://gitbox.apache.org/repos/asf/avro.git



The following commit(s) were added to refs/heads/branch-1.11 by this push:
     new 6dda3d738 docs: Change index.md to add a schema for data blocks (#2042)
6dda3d738 is described below

commit 6dda3d738a7890107236208bcae908db1cb83a7b
Author: dpcollins-google <[email protected]>
AuthorDate: Wed Mar 1 14:02:55 2023 -0500

    docs: Change index.md to add a schema for data blocks (#2042)
    
    Also make data file schemas valid json (no trailing commas)
---
 doc/content/en/docs/++version++/Specification/_index.md | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/doc/content/en/docs/++version++/Specification/_index.md b/doc/content/en/docs/++version++/Specification/_index.md
index c6716466d..df641e2db 100755
--- a/doc/content/en/docs/++version++/Specification/_index.md
+++ b/doc/content/en/docs/++version++/Specification/_index.md
@@ -472,7 +472,18 @@ A file data block consists of:
 * The serialized objects. If a codec is specified, this is compressed by that codec.
 * The file's 16-byte sync marker.
 
-Thus, each block's binary data can be efficiently extracted or skipped without deserializing the contents. The combination of block size, object counts, and sync markers enable detection of corrupt blocks and help ensure data integrity.
+A file data block is thus described by the following schema:
+```json
+{"type": "record", "name": "org.apache.avro.file.DataBlock",
+ "fields" : [
+   {"name": "count", "type": "long"},
+   {"name": "data", "type": "bytes"},
+   {"name": "sync", "type": {"type": "fixed", "name": "Sync", "size": 16}}
+  ]
+}
+```
+
+Each block's binary data can be efficiently extracted or skipped without deserializing the contents. The combination of block size, object counts, and sync markers enable detection of corrupt blocks and help ensure data integrity.
 
 ### Required Codecs
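
For readers of this commit, the DataBlock schema added above can be illustrated with a minimal Python sketch of reading one block from a byte stream. It assumes the standard Avro binary encoding (a `long` is a zigzag-encoded variable-length integer, and `bytes` is a `long` length followed by that many bytes). The helper names `read_long`, `write_long`, and `read_block` are hypothetical and are not part of any Avro library; codec decompression and decoding of the individual objects are left out.

```python
import io

SYNC_SIZE = 16  # size of the "Sync" fixed type in the DataBlock schema


def read_long(stream):
    """Decode one Avro long (zigzag, variable-length) from a binary stream."""
    accum = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("stream ended inside a long")
        byte = chunk[0]
        accum |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last byte
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)  # undo zigzag


def write_long(n):
    """Encode a 64-bit signed integer as an Avro long (used to build test input)."""
    zz = (n << 1) ^ (n >> 63)  # zigzag: small magnitudes become small unsigned values
    out = bytearray()
    while zz & ~0x7F:
        out.append((zz & 0x7F) | 0x80)
        zz >>= 7
    out.append(zz)
    return bytes(out)


def read_block(stream, expected_sync):
    """Read one data block: count (long), data (bytes), sync (fixed 16).

    Returns (count, data). The data is returned as raw bytes; if the file
    uses a codec, the caller would still need to decompress it.
    """
    count = read_long(stream)
    size = read_long(stream)  # length prefix of the "data" bytes field
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("stream ended inside block data")
    sync = stream.read(SYNC_SIZE)
    if sync != expected_sync:
        raise ValueError("sync marker mismatch: block may be corrupt")
    return count, data


# Usage: build a block by hand and read it back.
sync_marker = bytes(range(SYNC_SIZE))
payload = write_long(1) + write_long(-2)  # two serialized longs
block = write_long(2) + write_long(len(payload)) + payload + sync_marker
count, data = read_block(io.BytesIO(block), sync_marker)
```

Because the length prefix precedes the data, a reader that only wants to skip a block could seek forward by `size` bytes instead of reading them, then check the sync marker as above.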
