This is an automated email from the ASF dual-hosted git repository.
bhavanisudha pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 886effdb510 [DOCS] Add Record Index Metadata partition documentation and other schema details (#9705)
886effdb510 is described below
commit 886effdb510a8b08d8bc19af136263c03d6bd851
Author: Lokesh Jain <[email protected]>
AuthorDate: Thu Sep 21 06:41:12 2023 +0530
[DOCS] Add Record Index Metadata partition documentation and other schema details (#9705)
* [DOCS] Add Record Index Metadata partition documentation and other schema details
* Add table
* Address review comments
* Add formatting fixes
---------
Co-authored-by: Bhavani Sudha Saktheeswaran <[email protected]>
---
website/src/pages/tech-specs.md | 178 ++++++++++++++++++++++++++--------------
1 file changed, 116 insertions(+), 62 deletions(-)
diff --git a/website/src/pages/tech-specs.md b/website/src/pages/tech-specs.md
index 56155088846..fb3b67e63ab 100644
--- a/website/src/pages/tech-specs.md
+++ b/website/src/pages/tech-specs.md
@@ -58,10 +58,10 @@ Broadly, there can be two types of data files
1. **Base files** - Files that contain a set of records in columnar file formats like Apache Parquet/Orc or indexed formats like HFile format.
2. **log files** - Log files contain inserts, updates, deletes issued against a base file, encoded as a series of blocks. More on this [below](#log-file-format).
-| Table Type | Trade-off |
-|---------------------|-------------------|
-| Copy-on-Write (CoW) | Data is stored entirely in base files, optimized for read performance and ideal for slow changing datasets |
-| Merge-on-read (MoR) | Data is stored in a combination of base and log files, optimized to [balance the write and read performance](##balancing-write-and-query-performance) and ideal for frequently changing datasets |
+| Table Type | Trade-off |
+|:--------------------|:------------------|
+| Copy-on-Write (CoW) | Data is stored entirely in base files, optimized for read performance and ideal for slow-changing datasets |
+| Merge-on-read (MoR) | Data is stored in a combination of base and log files, optimized to [balance the write and read performance](#balancing-write-and-query-performance) and ideal for frequently changing datasets |
### Data Model
Hudi's data model is designed like an updatable key-value store. Within each partition, data is organized into a key-value model, where every record is uniquely identified by a record key.
@@ -69,24 +69,24 @@ Hudi's data model is designed like an update-able database like a key-value stor
#### User fields
To write a record into a Hudi table, each record must specify the following user fields.
-| User fields | Description [...]
-| --------------------------- |----------------------------------------------------- [...]
-| Partitioning key [Optional] | Value of this field defines the directory hierarchy within the table base path. This essentially provides an hierarchy isolation for managing data and related metadata [...]
-| Record key(s) | Record keys uniquely identify a record within each partition if partitioning is enabled [...]
-| Ordering field(s) | Hudi guarantees the uniqueness constraint of record key and the conflict resolution configuration manages strategies on how to disambiguate when multiple records with the same keys are to be merged into the table. The resolution logic can be based on an ordering field or can be custom, specific to the table. To ensure consistent behaviour dealing with duplicate records, the resolution logic should be commutative, associative and idempotent. This is also re [...]
+| User fields | Description [...]
+|:-----------------------------|:---------------------------------------------------- [...]
+| Partitioning key [Optional] | Value of this field defines the directory hierarchy within the table base path. This essentially provides hierarchical isolation for managing data and related metadata [...]
+| Record key(s) | Record keys uniquely identify a record within each partition if partitioning is enabled [...]
+| Ordering field(s) | Hudi guarantees the uniqueness constraint of the record key, and the conflict resolution configuration manages strategies for disambiguating when multiple records with the same keys are to be merged into the table. The resolution logic can be based on an ordering field or can be custom, specific to the table. To ensure consistent behaviour dealing with duplicate records, the resolution logic should be commutative, associative and idempotent. This is also r [...]
#### Meta fields
In addition to the fields specified by the table's schema, the following meta fields are added to each record, to unlock incremental processing and ease of debugging. These meta fields are part of the table schema and stored with the actual record to avoid re-computation.
-| Hudi meta-fields | Description |
-|-------------------------|--------------------------|
-| \_hoodie\_commit\_time | This field contains the commit timestamp in the [timeline](#transaction-log-timeline) that created this record. This enables granular, record-level history tracking on the table, much like database change-data-capture. |
-| \_hoodie\_commit\_seqno | This field contains a unique sequence number for each record within each transaction. This serves much like offsets in Apache Kafka topics, to enable generating streams out of tables. |
-| \_hoodie\_record\_key | Unique record key identifying the record within the partition. Key is materialized to avoid changes to key field(s) resulting in violating unique constraints maintained within a table. |
-| \_hoodie\_partition\_path | Partition path under which the record is organized into. |
-| \_hoodie\_file\_name | The data file name this record belongs to. |
+| Hudi meta-fields | Description |
+|:--------------------------|:-------------------------|
+| \_hoodie\_commit\_time | This field contains the commit timestamp in the [timeline](#transaction-log-timeline) that created this record. This enables granular, record-level history tracking on the table, much like database change-data-capture. |
+| \_hoodie\_commit\_seqno | This field contains a unique sequence number for each record within each transaction. This serves much like offsets in Apache Kafka topics, to enable generating streams out of tables. |
+| \_hoodie\_record\_key | Unique record key identifying the record within the partition. The key is materialized to avoid changes to key field(s) violating the unique constraints maintained within a table. |
+| \_hoodie\_partition\_path | Partition path under which the record is organized. |
+| \_hoodie\_file\_name | The data file name this record belongs to. |
Within a given file, all records share the same values for `_hoodie_partition_path` and `_hoodie_file_name`, so these fields are easily compressed away with no overhead in columnar file formats. The other fields can also be optional for writers, depending on whether protection against key field changes or incremental processing is desired. More on how to populate these fields in the sections below.
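To make the meta fields above concrete, here is a minimal sketch of how a writer could attach them to a user record. This is illustrative only, not Hudi's implementation: the `add_meta_fields` helper and the exact composition of `_hoodie_commit_seqno` are assumptions; only the five field names come from the spec.

```python
# Illustrative sketch (not Hudi's writer): prepend the five Hudi meta fields
# from the table above to a user record before it lands in a data file.
def add_meta_fields(record, commit_time, seqno, partition_path, file_name):
    """Return a new dict with Hudi meta fields prepended to the user record.

    The seqno composition below is a hypothetical example, not Hudi's format.
    """
    return {
        "_hoodie_commit_time": commit_time,               # commit that created the record
        "_hoodie_commit_seqno": f"{commit_time}_{seqno}", # unique within the transaction
        "_hoodie_record_key": record["key"],              # materialized record key
        "_hoodie_partition_path": partition_path,         # partition the record lives in
        "_hoodie_file_name": file_name,                   # data file holding the record
        **{k: v for k, v in record.items() if k != "key"},
    }

row = add_meta_fields({"key": "uuid-1", "amount": 42},
                      commit_time="20230921064112000", seqno=1,
                      partition_path="2023/09/21",
                      file_name="abc-0_1-0-1.parquet")
```

Because `_hoodie_partition_path` and `_hoodie_file_name` are constant within a file, a columnar format encodes them at negligible cost, as the paragraph above notes.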
@@ -111,17 +111,17 @@ Monotonically increasing value to denote strict ordering of actions in the timel
**Action type:**
Type of action. The following are the actions on the Hudi timeline.
-| Action type | Description |
-| ------------- |-------------|
-| commit | Commit denotes an **atomic write (inserts, updates and deletes)** of records in a table. A commit in Hudi is an atomic way of updating data, metadata and indexes. The guarantee is that all or none the changes within a commit will be visible to the readers |
-| deltacommit | Special version of `commit` which is applicable only on a Merge-on-Read storage engine. The writes are accumulated and batched to improve write performance |
-| rollback | Rollback denotes that the changes made by the corresponding commit/delta commit were unsuccessful & hence rolled back, removing any partial files produced during such a write |
-| savepoint | Savepoint is a special marker to ensure a particular commit is not automatically cleaned. It helps restore the table to a point on the timeline, in case of disaster/data recovery scenarios |
-| restore | Restore denotes that the table was restored to a particular savepoint. |
-| clean | Management activity that cleans up versions of data files that no longer will be accessed |
-| compaction | Management activity to optimize the storage for query performance. This action applies the batched up updates from `deltacommit` and re-optimizes data files for query performance |
-| replacecommit | Management activity to replace a set of data files atomically with another. It can be used to cluster the data for better query performance. This action is different from a `commit` in that the table state before and after are logically equivalent |
-| indexing | Management activity to update the index with the data. This action does not change data, only updates the index aynchronously to data changes |
+| Action type | Description |
+|:---------------|:------------|
+| commit | Commit denotes an **atomic write (inserts, updates and deletes)** of records in a table. A commit in Hudi is an atomic way of updating data, metadata and indexes. The guarantee is that all or none of the changes within a commit will be visible to the readers |
+| deltacommit | Special version of `commit` which is applicable only on a Merge-on-Read storage engine. The writes are accumulated and batched to improve write performance |
+| rollback | Rollback denotes that the changes made by the corresponding commit/delta commit were unsuccessful and hence rolled back, removing any partial files produced during such a write |
+| savepoint | Savepoint is a special marker to ensure a particular commit is not automatically cleaned. It helps restore the table to a point on the timeline, in case of disaster/data recovery scenarios |
+| restore | Restore denotes that the table was restored to a particular savepoint. |
+| clean | Management activity that cleans up versions of data files that will no longer be accessed |
+| compaction | Management activity to optimize the storage for query performance. This action applies the batched-up updates from `deltacommit` and re-optimizes data files for query performance |
+| replacecommit | Management activity to replace a set of data files atomically with another. It can be used to cluster the data for better query performance. This action is different from a `commit` in that the table state before and after are logically equivalent |
+| indexing | Management activity to update the index with the data. This action does not change data, only updates the index asynchronously to data changes |
**Action state:**
Denotes the state transition identifier (requested -\> inflight -\> completed)
@@ -152,13 +152,65 @@ By reconciling all the actions in the timeline, the state of the Hudi table can
## Metadata
-Hudi automatically extracts the physical data statistics and stores the metadata along with the data to improve write and query performance. Hudi Metadata is an internally-managed table which organizes the table metadata under the base path *.hoodie/metadata.* The metadata is in itself a Hudi table, organized with the Hudi merge-on-read storage format. Every record stored in the metadata table is a Hudi record and hence has partitioning key and record key specified. Following are the met [...]
+Hudi automatically extracts the physical data statistics and stores the metadata along with the data to improve write and query performance. Hudi Metadata is an internally-managed table which organizes the table metadata under the base path *.hoodie/metadata.* The metadata is in itself a Hudi table, organized with the Hudi merge-on-read storage format. Every record stored in the metadata table is a Hudi record and hence has a partitioning key and record key specified.
+
+The Apache Hudi platform employs the HFile format to store metadata and indexes to ensure high performance, though
+different implementations are free to choose their own. The following are the metadata table partitions:
+
+- **files** - Partition path to file name index. The key for the Hudi record is the partition path and the
+actual record is a map of file name to an instance of [HoodieMetadataFileInfo][15] (refer to the schema below).
+The files index can be used to do file listing and filter-based pruning of the scanset during query.
+
+| Schema | Field Name | Data Type | Description |
+|:------------------------|:-------------|:-----------|:----------------------------------|
+| HoodieMetadataFileInfo | `size` | long | size of the file |
+| | `isDeleted` | boolean | whether the file has been deleted |
+
+- **bloom\_filters** - Bloom filter index to help map a record key to the actual file. The Hudi key is
+`str_concat(hash(partition name), hash(file name))` and the actual payload is an instance of
+[HudiMetadataBloomFilter][16] (refer to the schema below). The bloom filter is used to accelerate
+'presence checks' validating whether a particular record is present in the file, which is used during merging,
+hash-based joins, point-lookup queries, etc.
+
+| Schema | Field Name | Data Type | Description |
+|:--------------------------|:---------------|:-----------|:-----------------------------------------------------|
+| HudiMetadataBloomFilter | `size` | long | size of the file |
+| | `type` | string | type code of the bloom filter |
+| | `timestamp` | string | timestamp when the bloom filter was created/updated |
+| | `bloomFilter` | bytes | the actual bloom filter for the data file |
+| | `isDeleted` | boolean | whether the bloom filter entry is valid |
+
+- **column\_stats** - contains statistics of columns for all the records in the table. This enables
+fine-grained file pruning for filters and join conditions in the query. The actual payload is an instance of
+[HoodieMetadataColumnStats][17] (refer to the schema below).
+
+| Schema | Field Name | Data Type | Description |
+|:----------------------------|:-------------------------|:-------------------------------------------|:----------------------------------------------|
+| HoodieMetadataColumnStats | `fileName` | string | file name for which the column stat applies |
+| | `columnName` | string | column name for which the column stat applies |
+| | `minValue` | [Wrapper type][19] (based on data schema) | minimum value of the column in the file |
+| | `maxValue` | [Wrapper type][19] (based on data schema) | maximum value of the column in the file |
+| | `valueCount` | long | total count of values |
+| | `nullCount` | long | total count of null values |
+| | `totalSize` | long | total storage size on disk |
+| | `totalUncompressedSize` | long | total uncompressed storage size on disk |
+| | `isDeleted` | boolean | whether the column stat entry is valid |
+
+- **record\_index** - contains information about record keys and their location in the dataset. This improves
+performance of updates since it provides file locations for the updated records, and also enables fine-grained
+file pruning for filters and join conditions in the query. The payload is an instance of
+[HoodieRecordIndexInfo][18] (refer to the schema below).
+
+| Schema | Field Name | Data Type | Description |
+|:------------------------|:---------------------|:-----------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| HoodieRecordIndexInfo | `partitionName` | string | partition name to which the record belongs |
+| | `fileIdEncoding` | int | determines the fields used to deduce the file id. When the encoding is 0, the file id can be deduced from fileIdLowBits, fileIdHighBits and fileIndex. When the encoding is 1, the file id is available in raw string format in the fileId field |
+| | `fileId` | string | file id in raw string format, available when encoding is set to 1 |
+| | `fileIdHighBits` | long | the file id can be deduced as {UUID}-{fileIndex} when encoding is set to 0; fileIdHighBits and fileIdLowBits form the UUID |
+| | `fileIdLowBits` | long | the file id can be deduced as {UUID}-{fileIndex} when encoding is set to 0; fileIdHighBits and fileIdLowBits form the UUID |
+| | `fileIndex` | int | the file id can be deduced as {UUID}-{fileIndex} when encoding is set to 0; fileIndex forms the suffix |
+| | `instantTime` | long | epoch time in milliseconds representing the commit time at which the record was added |
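The fileIdEncoding rules above can be sketched as follows. This is an illustrative helper under stated assumptions, not a Hudi API: `deduce_file_id` is a hypothetical name, and the high/low bits are treated as Java-style 64-bit values combined the way `java.util.UUID(mostSigBits, leastSigBits)` would.

```python
import uuid

# Sketch of deducing the file id from a HoodieRecordIndexInfo entry, following
# the fileIdEncoding rules in the table above. Field names mirror the schema;
# the function itself is illustrative, not Hudi's implementation.
def deduce_file_id(entry):
    if entry["fileIdEncoding"] == 1:
        # Encoding 1: the file id is stored verbatim in the fileId field.
        return entry["fileId"]
    # Encoding 0: fileIdHighBits/fileIdLowBits form a UUID; fileIndex is the suffix.
    u = uuid.UUID(int=((entry["fileIdHighBits"] & 0xFFFFFFFFFFFFFFFF) << 64)
                      | (entry["fileIdLowBits"] & 0xFFFFFFFFFFFFFFFF))
    return f"{u}-{entry['fileIndex']}"

# Round-trip check: split a UUID into high/low bits and reassemble it.
u = uuid.UUID("12345678-1234-5678-1234-567812345678")
entry = {"fileIdEncoding": 0,
         "fileIdHighBits": u.int >> 64,
         "fileIdLowBits": u.int & 0xFFFFFFFFFFFFFFFF,
         "fileIndex": 0}
assert deduce_file_id(entry) == f"{u}-0"
```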
-- **files** - Partition path to file name index. Key for the Hudi record is the partition path and the actual record is a map of file name to an instance of [HoodieMetadataFileInfo][15]. The files index can be used to do file listing and do filter based pruning of the scanset during query
-- **bloom\_filters** - Bloom filter index to help map a record key to the actual file. The Hudi key is `str_concat(hash(partition name), hash(file name))` and the actual payload is an instance of [HudiMetadataBloomFilter][16]. Bloom filter is used to accelerate 'presence checks' validating whether particular record is present in the file, which is used during merging, hash-based joins, point-lookup queries, etc.
-- **column\_stats** - contains statistics of columns for all the records in the table. This enables fine grained file pruning for filters and join conditions in the query. The actual payload is an instance of [HoodieMetadataColumnStats][17].
-
-Apache Hudi platform employs HFile format, to store metadata and indexes, to ensure high performance, though different implementations are free to choose their own.
## File Layout Hierarchy
@@ -199,19 +251,19 @@ Hudi Log format specification is as follows.
![hudi\_log\_format\_v2][image-1]
-| Section | \#Bytes | Description |
-|------------------------| -------- | ---------------------- |
-| **magic** | 6 | 6 Characters '#HUDI#' stored as a byte array. Sanity check for block corruption to assert start 6 bytes matches the magic byte[]. |
-| **LogBlock length** | 8 | Length of the block excluding the magic. |
-| **version** | 4 | Version of the Log file format, monotonically increasing to support backwards compatibility |
-| **type** | 4 | Represents the type of the log block. Id of the type is serialized as an Integer. |
-| **header length** | 8 | Length of the header section to follow |
-| **header** | variable | Custom serialized map of header metadata entries. 4 bytes of map size that denotes number of entries, then for each entry 4 bytes of metadata type, followed by length/bytearray of variable length utf-8 string. |
-| **content length** | 8 | Length of the actual content serialized |
-| **content** | variable | The content contains the serialized records in one of the supported file formats (Apache Avro, Apache Parquet or Apache HFile) |
-| **footer length** | 8 | Length of the footer section to follow |
-| **footer** | variable | Similar to Header. Map of footer metadata entries. |
-| **total block length** | 8 | Total size of the block including the magic bytes. This is used to determine if a block is corrupt by comparing to the block size in the header. Each log block assumes that the block size will be last data written in a block. Any data if written after is just ignored. |
+| Section | \#Bytes | Description |
+|:------------------------|:----------|:----------------------|
+| **magic** | 6 | 6 characters '#HUDI#' stored as a byte array. Sanity check for block corruption, asserting that the first 6 bytes match the magic byte[]. |
+| **LogBlock length** | 8 | Length of the block excluding the magic. |
+| **version** | 4 | Version of the log file format, monotonically increasing to support backwards compatibility |
+| **type** | 4 | Represents the type of the log block. The id of the type is serialized as an Integer. |
+| **header length** | 8 | Length of the header section to follow |
+| **header** | variable | Custom serialized map of header metadata entries: 4 bytes of map size denoting the number of entries, then for each entry 4 bytes of metadata type, followed by a length-prefixed byte array of a variable-length utf-8 string. |
+| **content length** | 8 | Length of the actual serialized content |
+| **content** | variable | The content contains the serialized records in one of the supported file formats (Apache Avro, Apache Parquet or Apache HFile) |
+| **footer length** | 8 | Length of the footer section to follow |
+| **footer** | variable | Similar to the header. Map of footer metadata entries. |
+| **total block length** | 8 | Total size of the block including the magic bytes. This is used to determine if a block is corrupt by comparing it to the block size in the header. Each log block assumes that the block size will be the last data written in a block; any data written after it is ignored. |
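The fixed-width fields in the layout above can be sketched with a toy round-trip. This is a sketch under stated assumptions, not Hudi's reader or writer: big-endian integers (what Java's `DataOutputStream` emits) and the exact accounting of the two length fields are assumptions, and the header/footer map serialization is elided (raw bytes stand in).

```python
import struct

MAGIC = b"#HUDI#"

# Illustrative parser for the field sequence in the table above: magic,
# LogBlock length, version, type, then length-prefixed header/content/footer,
# and a trailing total block length used as a corruption check.
def parse_block(buf):
    assert buf[:6] == MAGIC, "corrupt block: bad magic"
    off = 6
    (block_len,) = struct.unpack_from(">q", buf, off); off += 8   # excludes magic
    version, btype = struct.unpack_from(">ii", buf, off); off += 8
    (hlen,) = struct.unpack_from(">q", buf, off); off += 8
    header = buf[off:off + hlen]; off += hlen
    (clen,) = struct.unpack_from(">q", buf, off); off += 8
    content = buf[off:off + clen]; off += clen
    (flen,) = struct.unpack_from(">q", buf, off); off += 8
    footer = buf[off:off + flen]; off += flen
    (total_len,) = struct.unpack_from(">q", buf, off); off += 8   # includes magic
    assert total_len == off, "corrupt block: trailing length mismatch"
    return {"version": version, "type": btype, "header": header,
            "content": content, "footer": footer}

# Build a matching buffer for demonstration (header/footer maps elided).
def build_block(version, btype, header, content, footer):
    body = struct.pack(">ii", version, btype)
    body += struct.pack(">q", len(header)) + header
    body += struct.pack(">q", len(content)) + content
    body += struct.pack(">q", len(footer)) + footer
    block_len = 8 + len(body) + 8                  # length field + body + trailer
    buf = MAGIC + struct.pack(">q", block_len) + body
    return buf + struct.pack(">q", len(buf) + 8)   # total incl. magic and itself

blk = parse_block(build_block(2, 4, b"H", b"avro-bytes", b""))
assert blk["content"] == b"avro-bytes"
```

The trailing total-block-length check mirrors the corruption detection described in the last table row: a truncated or overwritten block fails the length comparison.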
Metadata key mapping from Integer to actual metadata is as follows
@@ -233,11 +285,11 @@ Encodes a command to the log reader. The Command block must be 0 byte content bl
![spec\_log\_format\_delete\_block][image-2]
-| Section | \#bytes | Description |
-| -------------- | -------- | ------------------- |
-| format version | 4 | version of the log file format |
-| length | 8 | length of the deleted keys section to follow |
-| deleted keys | variable | Tombstone of the record to encode a delete. The following 3 fields are serialized using the KryoSerializer. **Record Key** - Unique record key within the partition to deleted **Partition Path** - Partition path of the record deleted **Ordering Value** - In a particular batch of updates, the delete block is always written after the data (Avro/HFile/Parquet) block. This field would preserve the ordering of deletes and inserts within the same batch. |
+| Section | \#bytes | Description |
+|:----------------|:----------|:------------------|
+| format version | 4 | version of the log file format |
+| length | 8 | length of the deleted keys section to follow |
+| deleted keys | variable | Tombstone of the record to encode a delete. The following 3 fields are serialized using the KryoSerializer. **Record Key** - unique record key within the partition to be deleted. **Partition Path** - partition path of the deleted record. **Ordering Value** - in a particular batch of updates, the delete block is always written after the data (Avro/HFile/Parquet) block; this field preserves the ordering of deletes and inserts within the same batch. |
##### Corrupted Block (Id: 3)
@@ -249,12 +301,12 @@ Data block serializes the actual records written into the log file
![spec\_log\_format\_avro\_block][image-3]
-| Section | \#bytes | Description |
-| -------------- | -------- | ------------------------------------------------------------------- |
-| format version | 4 | version of the log file format |
-| record count | 4 | total number of records in this block |
-| record length | 8 | length of the record content to follow |
-| record content | variable | Record represented as an Avro record serialized using BinaryEncoder |
+| Section | \#bytes | Description |
+|:----------------|:----------|:---------------------------------------------------------------------|
+| format version | 4 | version of the log file format |
+| record count | 4 | total number of records in this block |
+| record length | 8 | length of the record content to follow |
+| record content | variable | Record represented as an Avro record serialized using BinaryEncoder |
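A walk over the data-block content per the table above might look like the following sketch. It assumes one (length, bytes) pair per record and big-endian integers, neither of which the table states explicitly, and it leaves the Avro payloads as opaque bytes (decoding them would use Avro's BinaryDecoder, not shown).

```python
import struct

# Sketch of walking the Avro data block content per the table above: a 4-byte
# format version, a 4-byte record count, then one length-prefixed entry per
# record. Illustrative only, not Hudi's reader.
def read_avro_block(buf):
    version, count = struct.unpack_from(">ii", buf, 0)
    off, records = 8, []
    for _ in range(count):
        (rlen,) = struct.unpack_from(">q", buf, off); off += 8
        records.append(buf[off:off + rlen]); off += rlen   # raw Avro bytes
    return version, records

# Build a demo payload with two opaque "records".
payload = struct.pack(">ii", 3, 2)
for rec in (b"\x02foo", b"\x02bar"):
    payload += struct.pack(">q", len(rec)) + rec

assert read_avro_block(payload) == (3, [b"\x02foo", b"\x02bar"])
```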
##### HFile Block (Id: 5)
@@ -307,10 +359,10 @@ A critical design choice for any table is to pick the right trade-offs in the da
#### Table types
-| | Merge Efficiency | Query Efficiency [...]
-| ------------------- |------------------------------------------|------------------------------------------- [...]
-| Copy on Write (COW) | **Tunable** <br />COW table type creates a new File slice in the file-group for every batch of updates. Write amplification can be quite high when the update is spread across multiple file groups. The cost involved can be high over a time period especially on tables with low data latency requirements. | **Optimal** <br />COW table types create whole readable data files in open source columnar file formats on each merge batch, there is minimal overhead per recor [...]
-| Merge on Read (MOR) | **Optimal** <br />MOR table type batches the updates to the file slice in a separate optimized Log file, write amplification is amortized over time when sufficient updates are batched. The merge cost involved will be lower than COW since the churn on the records re-written for every update is much lower. | **Tunable**<br />MOR Table type required record level merging during query. Although there are techniques to make this merge as efficient as possible, there is [...]
+| | Merge Efficiency | Query Efficiency [...]
+|:--------------------|:-----------------------------------------|:------------------------------------------ [...]
+| Copy on Write (COW) | **Tunable** <br />The COW table type creates a new file slice in the file group for every batch of updates. Write amplification can be quite high when the update is spread across multiple file groups, and the cost involved can be high over time, especially on tables with low data-latency requirements. | **Optimal** <br />COW table types create whole readable data files in open source columnar file formats on each merge batch, there is minimal overhead per reco [...]
+| Merge on Read (MOR) | **Optimal** <br />The MOR table type batches the updates to the file slice in a separate optimized log file; write amplification is amortized over time when sufficient updates are batched. The merge cost involved will be lower than COW since the churn on the records re-written for every update is much lower. | **Tunable**<br />The MOR table type requires record-level merging during query. Although there are techniques to make this merge as efficient as possible, there is [...]
> An interesting observation on the MOR table format is that, by providing a
> special view of the table which serves only the base files in the file slices
> (the read-optimized query of a MOR table), a query can pick between query
> efficiency and data freshness dynamically at query time. Compaction frequency
> determines the data freshness of the read-optimized view. With this, MOR has
> all the levers required to balance merge and query performance dynamically.
@@ -427,7 +479,9 @@ The efficiency of Optimistic concurrency is inversely proportional to the possib
[15]: https://github.com/apache/hudi/blob/master/hudi-common/src/main/avro/HoodieMetadata.avsc#L34
[16]: https://github.com/apache/hudi/blob/master/hudi-common/src/main/avro/HoodieMetadata.avsc#L66
[17]: https://github.com/apache/hudi/blob/master/hudi-common/src/main/avro/HoodieMetadata.avsc#L101
+[18]: https://github.com/apache/hudi/blob/master/hudi-common/src/main/avro/HoodieMetadata.avsc#L369
+[19]: https://github.com/apache/hudi/blob/master/hudi-common/src/main/avro/HoodieMetadata.avsc#L125
[image-1]: /assets/images/hudi_log_format_v2.png
[image-2]: /assets/images/spec/spec_log_format_delete_block.png
-[image-3]: /assets/images/spec/spec_log_format_avro_block.png
\ No newline at end of file
+[image-3]: /assets/images/spec/spec_log_format_avro_block.png