This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/parquet-site.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 315353f deploy: a506efc9d2d708e86709841311f3220d828fb4cf
315353f is described below
commit 315353fd64f3a11a512ee77d6a0411e7e597ebce
Author: alamb <[email protected]>
AuthorDate: Wed Nov 12 11:32:23 2025 +0000
deploy: a506efc9d2d708e86709841311f3220d828fb4cf
---
output/_print/docs/contribution-guidelines/index.html | 4 ++--
output/_print/docs/file-format/data-pages/index.html | 4 ++--
output/_print/docs/file-format/index.html | 4 ++--
output/_print/docs/index.html | 10 +++++-----
output/_print/docs/overview/index.html | 4 ++--
.../docs/contribution-guidelines/contributing/index.html | 8 ++++----
output/docs/file-format/data-pages/encodings/index.html | 16 ++++++++--------
output/docs/file-format/data-pages/encryption/index.html | 8 ++++----
output/docs/file-format/data-pages/index.xml | 2 +-
output/docs/overview/index.html | 4 ++--
output/index.html | 4 ++--
output/index.xml | 2 +-
output/sitemap.xml | 2 +-
13 files changed, 36 insertions(+), 36 deletions(-)
diff --git a/output/_print/docs/contribution-guidelines/index.html b/output/_print/docs/contribution-guidelines/index.html
index eedc504..b6eb43e 100644
--- a/output/_print/docs/contribution-guidelines/index.html
+++ b/output/_print/docs/contribution-guidelines/index.html
@@ -1,6 +1,6 @@
<!doctype html><html itemscope itemtype=http://schema.org/WebPage lang=en
class=no-js><head><meta charset=utf-8><meta name=viewport
content="width=device-width,initial-scale=1,shrink-to-fit=no"><link
rel=canonical type=text/html href=/docs/contribution-guidelines/><link
rel=alternate type=application/rss+xml
href=/docs/contribution-guidelines/index.xml><meta name=robots
content="noindex, nofollow"><link rel="shortcut icon"
href=/favicons/favicon.ico><link rel=apple-touch-icon href=/favic [...]
<a href=# onclick="return print(),!1">Click here to print</a>.</p><p><a
href=/docs/contribution-guidelines/>Return to the regular view of this
page</a>.</p></div><h1 class=title>Developer Guide</h1><div class=lead>All
developer resources related to Parquet.</div><ul><li>1: <a
href=#pg-84c5df6519663fa8413f5c392f10dbd4>Sub-Projects</a></li><li>2: <a
href=#pg-0fc7677c5a8dcd5250334bbf678cb165>Building Parquet</a></li><li>3: <a
href=#pg-47cac26307c77b16f1b9e75c1e46efec>Contributing to Parquet [...]
-Java resources can be build using <code>mvn package</code>. The current stable
version should always be available from Maven Central.</p><p>C++ thrift
resources can be generated via make.</p><p>Thrift can be also code-genned into
any other thrift-supported language.</p></div><div class=td-content
style=page-break-before:always><h1 id=pg-47cac26307c77b16f1b9e75c1e46efec>3 -
Contributing to Parquet-Java</h1><div class=lead>How to contribute to
Parquet-Java</div><h2 id=pull-requests>Pull Re [...]
+Java resources can be build using <code>mvn package</code>. The current stable
version should always be available from Maven Central.</p><p>C++ thrift
resources can be generated via make.</p><p>Thrift can be also code-genned into
any other thrift-supported language.</p></div><div class=td-content
style=page-break-before:always><h1 id=pg-47cac26307c77b16f1b9e75c1e46efec>3 -
Contributing to Parquet-Java</h1><div class=lead>How to contribute to
Parquet-Java</div><h2 id=pull-requests>Pull Re [...]
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic> * @param c the current class
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic> * @return the corresponding logger
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic> * @deprecated will be removed in 2.0.0;
use org.slf4j.LoggerFactory instead.
@@ -10,7 +10,7 @@ Java resources can be build using <code>mvn package</code>. The current stable v
</span></span></span><span style=display:flex><span><span
style=color:#f8f8f8;text-decoration:underline> </span><span
style=color:#204a87;font-weight:700>return</span><span
style=color:#f8f8f8;text-decoration:underline> </span><span
style=color:#204a87;font-weight:700>new</span><span
style=color:#f8f8f8;text-decoration:underline> </span><span
style=color:#000>Log</span><span style=color:#000;font-weight:700>(</span><span
style=color:#000>c</span><span style=color:#000;font-weight:700> [...]
</span></span></span><span style=display:flex><span><span
style=color:#f8f8f8;text-decoration:underline></span><span
style=color:#000;font-weight:700>}</span><span
style=color:#f8f8f8;text-decoration:underline>
</span></span></span></code></pre></div><p>Checking for API violations can be
done by running <code>mvn verify -Dmaven.test.skip=true
japicmp:cmp</code>.</p><h3 id=tracking-issues-using-milestones>Tracking issues
using Milestones</h3><p>When a PR is raised that fixes a bug, or a feature that
you want to target a certain version, make sure to attach a <a
href=https://github.com/apache/parquet-java/milestones>milestone</a>. This way
other committers can track certain versions, and see what [...]
-</span></span></code></pre></div><p>Now you can cherry-pick a PR to a previous
branch:</p><div class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code
class=language-sh data-lang=sh><span style=display:flex><span>get fetch --all
+</span></span></code></pre></div><p>Now you can cherry-pick a PR to a previous
branch:</p><div class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code
class=language-sh data-lang=sh><span style=display:flex><span>git fetch --all
</span></span><span style=display:flex><span>git checkout parquet-1.14.x
</span></span><span style=display:flex><span>git reset --hard
github-apache/parquet-1.14.x
</span></span><span style=display:flex><span>git cherry-pick
<hash-from-the-commit>
diff --git a/output/_print/docs/file-format/data-pages/index.html b/output/_print/docs/file-format/data-pages/index.html
index a61bdf3..03d354d 100644
--- a/output/_print/docs/file-format/data-pages/index.html
+++ b/output/_print/docs/file-format/data-pages/index.html
@@ -48,7 +48,7 @@ when implementing this format, the implementation provided by the
is authoritative.</p><h3 id=lz4_raw>LZ4_RAW</h3><p>A codec based on the <a
href=https://github.com/lz4/lz4/blob/dev/doc/lz4_Block_format.md>LZ4 block
format</a>.
If any ambiguity arises when implementing this format, the implementation
provided by the <a href=https://www.lz4.org/>LZ4 compression library</a> is
authoritative.</p></div><div class=td-content
style=page-break-before:always><h1 id=pg-9aa971e751fdd370158d525ad337ef7a>2 -
Encodings</h1><p><a name=PLAIN></a></p><h3 id=plain-plain--0>Plain: (PLAIN =
0)</h3><p>Supported Types: all</p><p>This is the plain encoding that must be
supported for types. It is
-intended to be the simplest encoding. Values are encoded back to
back.</p><p>The plain encoding is used whenever a more efficient encoding can
not be used. It
+intended to be the simplest encoding. Values are encoded back to
back.</p><p>The plain encoding is used whenever a more efficient encoding
cannot be used. It
stores the data in the following format:</p><ul><li>BOOLEAN: <a
href=/docs/file-format/data-pages/encodings/#BITPACKED>Bit Packed</a>, LSB
first</li><li>INT32: 4 bytes little endian</li><li>INT64: 8 bytes little
endian</li><li>INT96: 12 bytes little endian (deprecated)</li><li>FLOAT: 4
bytes IEEE little endian</li><li>DOUBLE: 8 bytes IEEE little
endian</li><li>BYTE_ARRAY: length in 4 bytes little endian followed by the
bytes contained in the array</li><li>FIXED_LEN_BYTE_ARRAY: the bytes [...]
point types are encoded in IEEE.</p><p>For the byte array type, it encodes the
length as a 4 byte little
endian, followed by the bytes.</p><h3
id=dictionary-encoding-plain_dictionary--2-and-rle_dictionary--8>Dictionary
Encoding (PLAIN_DICTIONARY = 2 and RLE_DICTIONARY = 8)</h3><p>The dictionary
encoding builds a dictionary of values encountered in a given column. The
@@ -430,7 +430,7 @@ structure with the
AES GCM algorithm - using a footer signing key, and an AAD constructed
according to the instructions
of the section 4.4. Only the nonce and GCM tag are stored in the file – as a
28-byte
fixed-length array, written right after the footer itself. The ciphertext is
not stored,
-because it is not required for footer integrity verification by
readers.</p><table><thead><tr><th>nonce (12 bytes)</th><th>tag (16
bytes)</th></tr></thead><tbody></tbody></table><p>The plaintext footer mode
sets the following fields in the the FileMetaData structure:</p><div
class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code
class=language-c data-lang=c><span style=display:flex><span><span
style=color:#204a87;font-weight:700>stru [...]
+because it is not required for footer integrity verification by
readers.</p><table><thead><tr><th>nonce (12 bytes)</th><th>tag (16
bytes)</th></tr></thead><tbody></tbody></table><p>The plaintext footer mode
sets the following fields in the FileMetaData structure:</p><div
class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code
class=language-c data-lang=c><span style=display:flex><span><span
style=color:#204a87;font-weight:700>struct</ [...]
</span></span><span style=display:flex><span><span
style=color:#000;font-weight:700>...</span>
</span></span><span style=display:flex><span> <span
style=color:#8f5902;font-style:italic>/**
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic> * Encryption algorithm. This field is
set only in encrypted files
diff --git a/output/_print/docs/file-format/index.html b/output/_print/docs/file-format/index.html
index 5ecc1df..cdd9014 100644
--- a/output/_print/docs/file-format/index.html
+++ b/output/_print/docs/file-format/index.html
@@ -323,7 +323,7 @@ when implementing this format, the implementation provided by the
is authoritative.</p><h3 id=lz4_raw>LZ4_RAW</h3><p>A codec based on the <a
href=https://github.com/lz4/lz4/blob/dev/doc/lz4_Block_format.md>LZ4 block
format</a>.
If any ambiguity arises when implementing this format, the implementation
provided by the <a href=https://www.lz4.org/>LZ4 compression library</a> is
authoritative.</p></div><div class=td-content
style=page-break-before:always><h1 id=pg-9aa971e751fdd370158d525ad337ef7a>7.2 -
Encodings</h1><p><a name=PLAIN></a></p><h3 id=plain-plain--0>Plain: (PLAIN =
0)</h3><p>Supported Types: all</p><p>This is the plain encoding that must be
supported for types. It is
-intended to be the simplest encoding. Values are encoded back to
back.</p><p>The plain encoding is used whenever a more efficient encoding can
not be used. It
+intended to be the simplest encoding. Values are encoded back to
back.</p><p>The plain encoding is used whenever a more efficient encoding
cannot be used. It
stores the data in the following format:</p><ul><li>BOOLEAN: <a
href=/docs/file-format/data-pages/encodings/#BITPACKED>Bit Packed</a>, LSB
first</li><li>INT32: 4 bytes little endian</li><li>INT64: 8 bytes little
endian</li><li>INT96: 12 bytes little endian (deprecated)</li><li>FLOAT: 4
bytes IEEE little endian</li><li>DOUBLE: 8 bytes IEEE little
endian</li><li>BYTE_ARRAY: length in 4 bytes little endian followed by the
bytes contained in the array</li><li>FIXED_LEN_BYTE_ARRAY: the bytes [...]
point types are encoded in IEEE.</p><p>For the byte array type, it encodes the
length as a 4 byte little
endian, followed by the bytes.</p><h3
id=dictionary-encoding-plain_dictionary--2-and-rle_dictionary--8>Dictionary
Encoding (PLAIN_DICTIONARY = 2 and RLE_DICTIONARY = 8)</h3><p>The dictionary
encoding builds a dictionary of values encountered in a given column. The
@@ -705,7 +705,7 @@ structure with the
AES GCM algorithm - using a footer signing key, and an AAD constructed
according to the instructions
of the section 4.4. Only the nonce and GCM tag are stored in the file – as a
28-byte
fixed-length array, written right after the footer itself. The ciphertext is
not stored,
-because it is not required for footer integrity verification by
readers.</p><table><thead><tr><th>nonce (12 bytes)</th><th>tag (16
bytes)</th></tr></thead><tbody></tbody></table><p>The plaintext footer mode
sets the following fields in the the FileMetaData structure:</p><div
class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code
class=language-c data-lang=c><span style=display:flex><span><span
style=color:#204a87;font-weight:700>stru [...]
+because it is not required for footer integrity verification by
readers.</p><table><thead><tr><th>nonce (12 bytes)</th><th>tag (16
bytes)</th></tr></thead><tbody></tbody></table><p>The plaintext footer mode
sets the following fields in the FileMetaData structure:</p><div
class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code
class=language-c data-lang=c><span style=display:flex><span><span
style=color:#204a87;font-weight:700>struct</ [...]
</span></span><span style=display:flex><span><span
style=color:#000;font-weight:700>...</span>
</span></span><span style=display:flex><span> <span
style=color:#8f5902;font-style:italic>/**
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic> * Encryption algorithm. This field is
set only in encrypted files
diff --git a/output/_print/docs/index.html b/output/_print/docs/index.html
index e38e290..39e73ed 100644
--- a/output/_print/docs/index.html
+++ b/output/_print/docs/index.html
@@ -5,7 +5,7 @@ The specification for the Apache Parquet file format is hosted in the parquet-fo
The specification for the Apache Parquet file format is hosted in the
parquet-format repository. The current implementation status of various
features can be found in the implementation status page."><link rel=preload
href=/scss/main.min.202e73a8e2d7abd80d0d2060167674ca75bb116f661294014d07d08d239ac55c.css
as=style integrity="sha256-IC5zqOLXq9gNDSBgFnZ0ynW7EW9mEpQBTQfQjSOaxVw="
crossorigin=anonymous><link
href=/scss/main.min.202e73a8e2d7abd80d0d2060167674ca75bb116f661294014d07d08d239ac55c
[...]
<a href=# onclick="return print(),!1">Click here to print</a>.</p><p><a
href=/docs/>Return to the regular view of this page</a>.</p></div><h1
class=title>Documentation</h1><ul><li>1: <a
href=#pg-6e17e09fffc1050f46600282def85180>Overview</a></li><ul><li>1.1: <a
href=#pg-2e5324b574579d3eb78ce816487d9e18>Motivation</a></li></ul><li>2: <a
href=#pg-fcf6186c3c17ed7f2e8577136109a520>Concepts</a></li><ul></ul><li>3: <a
href=#pg-52ee54aeff1ffc82031ec74e9a626eba>File Format</a></li><ul><li>3.1: <a
[...]
The current implementation status of various features can be found in the <a
href=/docs/file-format/implementationstatus/>implementation status</a>
page.</p></div></div><div class=td-content><h1
id=pg-6e17e09fffc1050f46600282def85180>1 - Overview</h1><div class=lead>All
about Parquet.</div><p>Apache Parquet is an open source, column-oriented data
file format designed for efficient data storage and retrieval.
-It provides high performance compression and encoding schemes to handle
complex data in bulk and is supported in many programming language and
analytics tools.</p><h3 id=parquet-format-specification>parquet-format
(Specification)</h3><p>The <a
href=https://github.com/apache/parquet-format>parquet-format</a> repository
hosts the official specification of the Parquet file format, defining how data
is structured and stored. This specification, along with the <a
href=https://github.com/apach [...]
+It provides high performance compression and encoding schemes to handle
complex data in bulk and is supported in many programming languages and
analytics tools.</p><h3 id=parquet-format-specification>parquet-format
(Specification)</h3><p>The <a
href=https://github.com/apache/parquet-format>parquet-format</a> repository
hosts the official specification of the Parquet file format, defining how data
is structured and stored. This specification, along with the <a
href=https://github.com/apac [...]
unchanged for describing this file format. The file format is
designed to work well on top of HDFS.</p></li><li><p><em>File</em>: A HDFS
file that must include the metadata for the file.
It does not need to actually contain the data.</p></li><li><p><em>Row
group</em>: A logical horizontal partitioning of the data into rows.
@@ -339,7 +339,7 @@ when implementing this format, the implementation provided by the
is authoritative.</p><h3 id=lz4_raw>LZ4_RAW</h3><p>A codec based on the <a
href=https://github.com/lz4/lz4/blob/dev/doc/lz4_Block_format.md>LZ4 block
format</a>.
If any ambiguity arises when implementing this format, the implementation
provided by the <a href=https://www.lz4.org/>LZ4 compression library</a> is
authoritative.</p></div><div class=td-content
style=page-break-before:always><h1 id=pg-9aa971e751fdd370158d525ad337ef7a>3.7.2
- Encodings</h1><p><a name=PLAIN></a></p><h3 id=plain-plain--0>Plain: (PLAIN =
0)</h3><p>Supported Types: all</p><p>This is the plain encoding that must be
supported for types. It is
-intended to be the simplest encoding. Values are encoded back to
back.</p><p>The plain encoding is used whenever a more efficient encoding can
not be used. It
+intended to be the simplest encoding. Values are encoded back to
back.</p><p>The plain encoding is used whenever a more efficient encoding
cannot be used. It
stores the data in the following format:</p><ul><li>BOOLEAN: <a
href=/docs/file-format/data-pages/encodings/#BITPACKED>Bit Packed</a>, LSB
first</li><li>INT32: 4 bytes little endian</li><li>INT64: 8 bytes little
endian</li><li>INT96: 12 bytes little endian (deprecated)</li><li>FLOAT: 4
bytes IEEE little endian</li><li>DOUBLE: 8 bytes IEEE little
endian</li><li>BYTE_ARRAY: length in 4 bytes little endian followed by the
bytes contained in the array</li><li>FIXED_LEN_BYTE_ARRAY: the bytes [...]
point types are encoded in IEEE.</p><p>For the byte array type, it encodes the
length as a 4 byte little
endian, followed by the bytes.</p><h3
id=dictionary-encoding-plain_dictionary--2-and-rle_dictionary--8>Dictionary
Encoding (PLAIN_DICTIONARY = 2 and RLE_DICTIONARY = 8)</h3><p>The dictionary
encoding builds a dictionary of values encountered in a given column. The
@@ -721,7 +721,7 @@ structure with the
AES GCM algorithm - using a footer signing key, and an AAD constructed
according to the instructions
of the section 4.4. Only the nonce and GCM tag are stored in the file – as a
28-byte
fixed-length array, written right after the footer itself. The ciphertext is
not stored,
-because it is not required for footer integrity verification by
readers.</p><table><thead><tr><th>nonce (12 bytes)</th><th>tag (16
bytes)</th></tr></thead><tbody></tbody></table><p>The plaintext footer mode
sets the following fields in the the FileMetaData structure:</p><div
class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code
class=language-c data-lang=c><span style=display:flex><span><span
style=color:#204a87;font-weight:700>stru [...]
+because it is not required for footer integrity verification by
readers.</p><table><thead><tr><th>nonce (12 bytes)</th><th>tag (16
bytes)</th></tr></thead><tbody></tbody></table><p>The plaintext footer mode
sets the following fields in the FileMetaData structure:</p><div
class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code
class=language-c data-lang=c><span style=display:flex><span><span
style=color:#204a87;font-weight:700>struct</ [...]
</span></span><span style=display:flex><span><span
style=color:#000;font-weight:700>...</span>
</span></span><span style=display:flex><span> <span
style=color:#8f5902;font-style:italic>/**
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic> * Encryption algorithm. This field is
set only in encrypted files
@@ -819,7 +819,7 @@ the scan.</p><p>The <code>min_values</code> and <code>max_values</code> are calc
field in the <code>FileMetaData</code> struct of the footer.</p></div><div
class=td-content style=page-break-before:always><h1
id=pg-e0ad5830788d45de8b55e0c2b119349a>3.10 - Implementation status</h1><p>This
page summarizes the features supported by different Parquet
implementations.</p><p><em>Note</em>: If you find out of date information,
please help us improve the accuracy
of this page by opening an issue or submitting a pull request.</p><h3
id=legend>Legend</h3><p>The value in each box means:</p><ul><li>✅:
supported</li><li>❌: not supported</li><li>(R/W): partial reader/writer only
support</li><li>(blank): no data</li></ul><p>Implementations:</p><ul><li><a
href=https://github.com/apache/arrow/tree/main/cpp/src/parquet>arrow</a>
(C++)</li><li><a href=https://github.com/apache/parquet-java>parquet-java</a>
(Java)</li><li><a href=https://github.com/apache/ar [...]
-Java resources can be build using <code>mvn package</code>. The current stable
version should always be available from Maven Central.</p><p>C++ thrift
resources can be generated via make.</p><p>Thrift can be also code-genned into
any other thrift-supported language.</p></div><div class=td-content
style=page-break-before:always><h1 id=pg-47cac26307c77b16f1b9e75c1e46efec>4.3 -
Contributing to Parquet-Java</h1><div class=lead>How to contribute to
Parquet-Java</div><h2 id=pull-requests>Pull [...]
+Java resources can be build using <code>mvn package</code>. The current stable
version should always be available from Maven Central.</p><p>C++ thrift
resources can be generated via make.</p><p>Thrift can be also code-genned into
any other thrift-supported language.</p></div><div class=td-content
style=page-break-before:always><h1 id=pg-47cac26307c77b16f1b9e75c1e46efec>4.3 -
Contributing to Parquet-Java</h1><div class=lead>How to contribute to
Parquet-Java</div><h2 id=pull-requests>Pull [...]
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic> * @param c the current class
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic> * @return the corresponding logger
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic> * @deprecated will be removed in 2.0.0;
use org.slf4j.LoggerFactory instead.
@@ -829,7 +829,7 @@ Java resources can be build using <code>mvn package</code>. The current stable v
</span></span></span><span style=display:flex><span><span
style=color:#f8f8f8;text-decoration:underline> </span><span
style=color:#204a87;font-weight:700>return</span><span
style=color:#f8f8f8;text-decoration:underline> </span><span
style=color:#204a87;font-weight:700>new</span><span
style=color:#f8f8f8;text-decoration:underline> </span><span
style=color:#000>Log</span><span style=color:#000;font-weight:700>(</span><span
style=color:#000>c</span><span style=color:#000;font-weight:700> [...]
</span></span></span><span style=display:flex><span><span
style=color:#f8f8f8;text-decoration:underline></span><span
style=color:#000;font-weight:700>}</span><span
style=color:#f8f8f8;text-decoration:underline>
</span></span></span></code></pre></div><p>Checking for API violations can be
done by running <code>mvn verify -Dmaven.test.skip=true
japicmp:cmp</code>.</p><h3 id=tracking-issues-using-milestones>Tracking issues
using Milestones</h3><p>When a PR is raised that fixes a bug, or a feature that
you want to target a certain version, make sure to attach a <a
href=https://github.com/apache/parquet-java/milestones>milestone</a>. This way
other committers can track certain versions, and see what [...]
-</span></span></code></pre></div><p>Now you can cherry-pick a PR to a previous
branch:</p><div class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code
class=language-sh data-lang=sh><span style=display:flex><span>get fetch --all
+</span></span></code></pre></div><p>Now you can cherry-pick a PR to a previous
branch:</p><div class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code
class=language-sh data-lang=sh><span style=display:flex><span>git fetch --all
</span></span><span style=display:flex><span>git checkout parquet-1.14.x
</span></span><span style=display:flex><span>git reset --hard
github-apache/parquet-1.14.x
</span></span><span style=display:flex><span>git cherry-pick
<hash-from-the-commit>
diff --git a/output/_print/docs/overview/index.html b/output/_print/docs/overview/index.html
index 0d87f58..2d8d584 100644
--- a/output/_print/docs/overview/index.html
+++ b/output/_print/docs/overview/index.html
@@ -1,5 +1,5 @@
-<!doctype html><html itemscope itemtype=http://schema.org/WebPage lang=en
class=no-js><head><meta charset=utf-8><meta name=viewport
content="width=device-width,initial-scale=1,shrink-to-fit=no"><link
rel=canonical type=text/html href=/docs/overview/><link rel=alternate
type=application/rss+xml href=/docs/overview/index.xml><meta name=robots
content="noindex, nofollow"><link rel="shortcut icon"
href=/favicons/favicon.ico><link rel=apple-touch-icon
href=/favicons/apple-touch-icon-180x180.p [...]
+<!doctype html><html itemscope itemtype=http://schema.org/WebPage lang=en
class=no-js><head><meta charset=utf-8><meta name=viewport
content="width=device-width,initial-scale=1,shrink-to-fit=no"><link
rel=canonical type=text/html href=/docs/overview/><link rel=alternate
type=application/rss+xml href=/docs/overview/index.xml><meta name=robots
content="noindex, nofollow"><link rel="shortcut icon"
href=/favicons/favicon.ico><link rel=apple-touch-icon
href=/favicons/apple-touch-icon-180x180.p [...]
<a href=# onclick="return print(),!1">Click here to print</a>.</p><p><a
href=/docs/overview/>Return to the regular view of this page</a>.</p></div><h1
class=title>Overview</h1><div class=lead>All about Parquet.</div><ul><li>1: <a
href=#pg-2e5324b574579d3eb78ce816487d9e18>Motivation</a></li></ul><div
class=content><p>Apache Parquet is an open source, column-oriented data file
format designed for efficient data storage and retrieval.
-It provides high performance compression and encoding schemes to handle
complex data in bulk and is supported in many programming language and
analytics tools.</p><h3 id=parquet-format-specification>parquet-format
(Specification)</h3><p>The <a
href=https://github.com/apache/parquet-format>parquet-format</a> repository
hosts the official specification of the Parquet file format, defining how data
is structured and stored. This specification, along with the <a
href=https://github.com/apach [...]
+It provides high performance compression and encoding schemes to handle
complex data in bulk and is supported in many programming languages and
analytics tools.</p><h3 id=parquet-format-specification>parquet-format
(Specification)</h3><p>The <a
href=https://github.com/apache/parquet-format>parquet-format</a> repository
hosts the official specification of the Parquet file format, defining how data
is structured and stored. This specification, along with the <a
href=https://github.com/apac [...]
2025
<span class=td-footer__authors>Apache Parquet</span></span><span
class=td-footer__all_rights_reserved>All Rights Reserved</span><span
class=ms-2><a href=https://policies.google.com/privacy target=_blank
rel=noopener>Privacy Policy</a></span></div></div></div></footer></div><script
src=/js/main.min.5ea9bb4146d5e591d8c4d97ce08831bb4f2ea2dbf02ed68c5ecd03b7fa7f2bb4.js
integrity="sha256-Xqm7QUbV5ZHYxNl84Igxu08uotvwLtaMXs0Dt/p/K7Q="
crossorigin=anonymous></script><script defer src=/js/click-to [...]
\ No newline at end of file
diff --git a/output/docs/contribution-guidelines/contributing/index.html b/output/docs/contribution-guidelines/contributing/index.html
index 3541b6d..55b894a 100644
--- a/output/docs/contribution-guidelines/contributing/index.html
+++ b/output/docs/contribution-guidelines/contributing/index.html
@@ -1,8 +1,8 @@
-<!doctype html><html itemscope itemtype=http://schema.org/WebPage lang=en
class=no-js><head><meta charset=utf-8><meta name=viewport
content="width=device-width,initial-scale=1,shrink-to-fit=no"><meta name=robots
content="index, follow"><link rel="shortcut icon"
href=/favicons/favicon.ico><link rel=apple-touch-icon
href=/favicons/apple-touch-icon-180x180.png sizes=180x180><link rel=icon
type=image/png href=/favicons/favicon-16x16.png sizes=16x16><link rel=icon
type=image/png href=/favicon [...]
+<!doctype html><html itemscope itemtype=http://schema.org/WebPage lang=en
class=no-js><head><meta charset=utf-8><meta name=viewport
content="width=device-width,initial-scale=1,shrink-to-fit=no"><meta name=robots
content="index, follow"><link rel="shortcut icon"
href=/favicons/favicon.ico><link rel=apple-touch-icon
href=/favicons/apple-touch-icon-180x180.png sizes=180x180><link rel=icon
type=image/png href=/favicons/favicon-16x16.png sizes=16x16><link rel=icon
type=image/png href=/favicon [...]
<a
href=https://github.com/apache/parquet-site/edit/production/content/en/docs/Contribution%20Guidelines/contributing.md
class="td-page-meta--edit td-page-meta__edit" target=_blank rel=noopener><i
class="fa-solid fa-pen-to-square fa-fw"></i> Edit this page</a>
<a
href="https://github.com/apache/parquet-site/new/production/content/en/docs/Contribution%20Guidelines?filename=change-me.md&value=---%0Atitle%3A+%22Long+Page+Title%22%0AlinkTitle%3A+%22Short+Nav+Title%22%0Aweight%3A+100%0Adescription%3A+%3E-%0A+++++Page+description+for+heading+and+indexes.%0A---%0A%0A%23%23+Heading%0A%0AEdit+this+template+to+create+your+new+page.%0A%0A%2A+Give+it+a+good+name%2C+ending+in+%60.md%60+-+e.g.+%60getting-started.md%60%0A%2A+Edit+the+%22front+matter%22+s
[...]
<a
href="https://github.com/apache/parquet-site/issues/new?title=Contributing%20to%20Parquet-Java"
class="td-page-meta--issue td-page-meta__issue" target=_blank rel=noopener><i
class="fa-solid fa-list-check fa-fw"></i> Create documentation issue</a>
-<a id=print href=/_print/docs/contribution-guidelines/><i class="fa-solid
fa-print fa-fw"></i> Print entire section</a></div><div class=td-toc><nav
id=TableOfContents><ul><li><a href=#pull-requests>Pull Requests</a></li><li><a
href=#committers>Committers</a><ul><li><a href=#merging-a-pull-request>Merging
a Pull Request</a></li><li><a href=#semantic-versioning>Semantic
versioning</a></li><li><a href=#tracking-issues-using-milestones>Tracking
issues using Milestones</a></li><li><a href=#ma [...]
+<a id=print href=/_print/docs/contribution-guidelines/><i class="fa-solid
fa-print fa-fw"></i> Print entire section</a></div><div class=td-toc><nav
id=TableOfContents><ul><li><a href=#pull-requests>Pull Requests</a></li><li><a
href=#committers>Committers</a><ul><li><a href=#merging-a-pull-request>Merging
a Pull Request</a></li><li><a href=#semantic-versioning>Semantic
versioning</a></li><li><a href=#tracking-issues-using-milestones>Tracking
issues using Milestones</a></li><li><a href=#ma [...]
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic> * @param c the current class
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic> * @return the corresponding logger
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic> * @deprecated will be removed in 2.0.0;
use org.slf4j.LoggerFactory instead.
@@ -12,13 +12,13 @@
</span></span></span><span style=display:flex><span><span
style=color:#f8f8f8;text-decoration:underline> </span><span
style=color:#204a87;font-weight:700>return</span><span
style=color:#f8f8f8;text-decoration:underline> </span><span
style=color:#204a87;font-weight:700>new</span><span
style=color:#f8f8f8;text-decoration:underline> </span><span
style=color:#000>Log</span><span style=color:#000;font-weight:700>(</span><span
style=color:#000>c</span><span style=color:#000;font-weight:700> [...]
</span></span></span><span style=display:flex><span><span
style=color:#f8f8f8;text-decoration:underline></span><span
style=color:#000;font-weight:700>}</span><span
style=color:#f8f8f8;text-decoration:underline>
</span></span></span></code></pre></div><p>Checking for API violations can be
done by running <code>mvn verify -Dmaven.test.skip=true
japicmp:cmp</code>.</p><h3 id=tracking-issues-using-milestones>Tracking issues
using Milestones</h3><p>When a PR is raised that fixes a bug, or a feature that
you want to target a certain version, make sure to attach a <a
href=https://github.com/apache/parquet-java/milestones>milestone</a>. This way
other committers can track certain versions, and see what [...]
-</span></span></code></pre></div><p>Now you can cherry-pick a PR to a previous
branch:</p><div class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code
class=language-sh data-lang=sh><span style=display:flex><span>get fetch --all
+</span></span></code></pre></div><p>Now you can cherry-pick a PR to a previous
branch:</p><div class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code
class=language-sh data-lang=sh><span style=display:flex><span>git fetch --all
</span></span><span style=display:flex><span>git checkout parquet-1.14.x
</span></span><span style=display:flex><span>git reset --hard
github-apache/parquet-1.14.x
</span></span><span style=display:flex><span>git cherry-pick
<hash-from-the-commit>
</span></span><span style=display:flex><span>git push
github-apache/parquet-1.14.x
</span></span></code></pre></div><h2 id=website>Website</h2><h3
id=release-documentation>Release Documentation</h3><p>To create documentation
for a new release of <code>parquet-format</code> create a new
<releasenumber>.md file under <code>content/en/blog/parquet-format</code>.
Please see existing files in that directory as an example.</p><p>To create
documentation for a new release of <code>parquet-java</code> create a new
<releasenumber>.md file under <code>content/en/blog/parquet-java [...]
job in the <a
href=https://github.com/apache/parquet-site/blob/staging/.github/workflows/deploy.yml>deployment
workflow</a> will be run, populating the <code>asf-staging</code> branch on
this repo with the necessary files.</li></ol><p><strong>Do not directly edit
the <code>asf-staging</code> branch of this repo</strong></p><h4
id=production>Production</h4><p>To make a change to the <code>production</code>
version of the website:</p><ol><li>Make a PR against the
<code>production</code> br [...]
-job in the <a
href=https://github.com/apache/parquet-site/blob/production/.github/workflows/deploy.yml>deployment
workflow</a> will be run, populating the <code>asf-site</code> branch on this
repo with the necessary files.</li></ol><p><strong>Do not directly edit the
<code>asf-site</code> branch of this repo</strong></p><div
class=td-page-meta__lastmod>Last modified November 13, 2024: <a
data-proofer-ignore
href=https://github.com/apache/parquet-site/commit/ee98419b8698b35a5c29d863b35990
[...]
+job in the <a
href=https://github.com/apache/parquet-site/blob/production/.github/workflows/deploy.yml>deployment
workflow</a> will be run, populating the <code>asf-site</code> branch on this
repo with the necessary files.</li></ol><p><strong>Do not directly edit the
<code>asf-site</code> branch of this repo</strong></p><div
class=td-page-meta__lastmod>Last modified November 12, 2025: <a
data-proofer-ignore
href=https://github.com/apache/parquet-site/commit/a506efc9d2d708e86709841311f322
[...]
2025
<span class=td-footer__authors>Apache Parquet</span></span><span
class=td-footer__all_rights_reserved>All Rights Reserved</span><span
class=ms-2><a href=https://policies.google.com/privacy target=_blank
rel=noopener>Privacy Policy</a></span></div></div></div></footer></div><script
src=/js/main.min.5ea9bb4146d5e591d8c4d97ce08831bb4f2ea2dbf02ed68c5ecd03b7fa7f2bb4.js
integrity="sha256-Xqm7QUbV5ZHYxNl84Igxu08uotvwLtaMXs0Dt/p/K7Q="
crossorigin=anonymous></script><script defer src=/js/click-to [...]
\ No newline at end of file
diff --git a/output/docs/file-format/data-pages/encodings/index.html
b/output/docs/file-format/data-pages/encodings/index.html
index 74550c2..d887213 100644
--- a/output/docs/file-format/data-pages/encodings/index.html
+++ b/output/docs/file-format/data-pages/encodings/index.html
@@ -1,21 +1,21 @@
<!doctype html><html itemscope itemtype=http://schema.org/WebPage lang=en
class=no-js><head><meta charset=utf-8><meta name=viewport
content="width=device-width,initial-scale=1,shrink-to-fit=no"><meta name=robots
content="index, follow"><link rel="shortcut icon"
href=/favicons/favicon.ico><link rel=apple-touch-icon
href=/favicons/apple-touch-icon-180x180.png sizes=180x180><link rel=icon
type=image/png href=/favicons/favicon-16x16.png sizes=16x16><link rel=icon
type=image/png href=/favicon [...]
This is the plain encoding that must be supported for types. It is intended to
be the simplest encoding. Values are encoded back to back.
-The plain encoding is used whenever a more efficient encoding can not be used.
It stores the data in the following format:
+The plain encoding is used whenever a more efficient encoding cannot be used.
It stores the data in the following format:
BOOLEAN: Bit Packed, LSB first INT32: 4 bytes little endian INT64: 8 bytes
little endian INT96: 12 bytes little endian (deprecated) FLOAT: 4 bytes IEEE
little endian DOUBLE: 8 bytes IEEE little endian BYTE_ARRAY: length in 4 bytes
little endian followed by the bytes contained in the array
FIXED_LEN_BYTE_ARRAY: the bytes contained in the array For native types, this
outputs the data as little endian. Floating point types are encoded in
IEEE."><meta property="og:url" content="/docs/file-fo [...]
This is the plain encoding that must be supported for types. It is intended to
be the simplest encoding. Values are encoded back to back.
-The plain encoding is used whenever a more efficient encoding can not be used.
It stores the data in the following format:
-BOOLEAN: Bit Packed, LSB first INT32: 4 bytes little endian INT64: 8 bytes
little endian INT96: 12 bytes little endian (deprecated) FLOAT: 4 bytes IEEE
little endian DOUBLE: 8 bytes IEEE little endian BYTE_ARRAY: length in 4 bytes
little endian followed by the bytes contained in the array
FIXED_LEN_BYTE_ARRAY: the bytes contained in the array For native types, this
outputs the data as little endian. Floating point types are encoded in
IEEE."><meta property="og:locale" content="en"><meta [...]
+The plain encoding is used whenever a more efficient encoding cannot be used.
It stores the data in the following format:
+BOOLEAN: Bit Packed, LSB first INT32: 4 bytes little endian INT64: 8 bytes
little endian INT96: 12 bytes little endian (deprecated) FLOAT: 4 bytes IEEE
little endian DOUBLE: 8 bytes IEEE little endian BYTE_ARRAY: length in 4 bytes
little endian followed by the bytes contained in the array
FIXED_LEN_BYTE_ARRAY: the bytes contained in the array For native types, this
outputs the data as little endian. Floating point types are encoded in
IEEE."><meta property="og:locale" content="en"><meta [...]
This is the plain encoding that must be supported for types. It is intended to
be the simplest encoding. Values are encoded back to back.
-The plain encoding is used whenever a more efficient encoding can not be used.
It stores the data in the following format:
-BOOLEAN: Bit Packed, LSB first INT32: 4 bytes little endian INT64: 8 bytes
little endian INT96: 12 bytes little endian (deprecated) FLOAT: 4 bytes IEEE
little endian DOUBLE: 8 bytes IEEE little endian BYTE_ARRAY: length in 4 bytes
little endian followed by the bytes contained in the array
FIXED_LEN_BYTE_ARRAY: the bytes contained in the array For native types, this
outputs the data as little endian. Floating point types are encoded in
IEEE."><meta itemprop=dateModified content="2025-10-2 [...]
+The plain encoding is used whenever a more efficient encoding cannot be used.
It stores the data in the following format:
+BOOLEAN: Bit Packed, LSB first INT32: 4 bytes little endian INT64: 8 bytes
little endian INT96: 12 bytes little endian (deprecated) FLOAT: 4 bytes IEEE
little endian DOUBLE: 8 bytes IEEE little endian BYTE_ARRAY: length in 4 bytes
little endian followed by the bytes contained in the array
FIXED_LEN_BYTE_ARRAY: the bytes contained in the array For native types, this
outputs the data as little endian. Floating point types are encoded in
IEEE."><meta itemprop=dateModified content="2025-11-1 [...]
This is the plain encoding that must be supported for types. It is intended to
be the simplest encoding. Values are encoded back to back.
-The plain encoding is used whenever a more efficient encoding can not be used.
It stores the data in the following format:
+The plain encoding is used whenever a more efficient encoding cannot be used.
It stores the data in the following format:
BOOLEAN: Bit Packed, LSB first INT32: 4 bytes little endian INT64: 8 bytes
little endian INT96: 12 bytes little endian (deprecated) FLOAT: 4 bytes IEEE
little endian DOUBLE: 8 bytes IEEE little endian BYTE_ARRAY: length in 4 bytes
little endian followed by the bytes contained in the array
FIXED_LEN_BYTE_ARRAY: the bytes contained in the array For native types, this
outputs the data as little endian. Floating point types are encoded in
IEEE."><link rel=preload href=/scss/main.min.202e73a8 [...]
<a
href=https://github.com/apache/parquet-site/edit/production/content/en/docs/File%20Format/Data%20Pages/encodings.md
class="td-page-meta--edit td-page-meta__edit" target=_blank rel=noopener><i
class="fa-solid fa-pen-to-square fa-fw"></i> Edit this page</a>
<a
href="https://github.com/apache/parquet-site/new/production/content/en/docs/File%20Format/Data%20Pages?filename=change-me.md&value=---%0Atitle%3A+%22Long+Page+Title%22%0AlinkTitle%3A+%22Short+Nav+Title%22%0Aweight%3A+100%0Adescription%3A+%3E-%0A+++++Page+description+for+heading+and+indexes.%0A---%0A%0A%23%23+Heading%0A%0AEdit+this+template+to+create+your+new+page.%0A%0A%2A+Give+it+a+good+name%2C+ending+in+%60.md%60+-+e.g.+%60getting-started.md%60%0A%2A+Edit+the+%22front+matter%22+
[...]
<a href="https://github.com/apache/parquet-site/issues/new?title=Encodings"
class="td-page-meta--issue td-page-meta__issue" target=_blank rel=noopener><i
class="fa-solid fa-list-check fa-fw"></i> Create documentation issue</a>
<a id=print href=/_print/docs/file-format/data-pages/><i class="fa-solid
fa-print fa-fw"></i> Print entire section</a></div><div class=td-toc><nav
id=TableOfContents><ul><li><ul><li><a href=#plain-plain--0>Plain: (PLAIN =
0)</a></li><li><a
href=#dictionary-encoding-plain_dictionary--2-and-rle_dictionary--8>Dictionary
Encoding (PLAIN_DICTIONARY = 2 and RLE_DICTIONARY = 8)</a></li><li><a
href=#run-length-encoding--bit-packing-hybrid-rle--3>Run Length Encoding /
Bit-Packing Hybrid (RLE = 3) [...]
-intended to be the simplest encoding. Values are encoded back to
back.</p><p>The plain encoding is used whenever a more efficient encoding can
not be used. It
+intended to be the simplest encoding. Values are encoded back to
back.</p><p>The plain encoding is used whenever a more efficient encoding
cannot be used. It
stores the data in the following format:</p><ul><li>BOOLEAN: <a
href=/docs/file-format/data-pages/encodings/#BITPACKED>Bit Packed</a>, LSB
first</li><li>INT32: 4 bytes little endian</li><li>INT64: 8 bytes little
endian</li><li>INT96: 12 bytes little endian (deprecated)</li><li>FLOAT: 4
bytes IEEE little endian</li><li>DOUBLE: 8 bytes IEEE little
endian</li><li>BYTE_ARRAY: length in 4 bytes little endian followed by the
bytes contained in the array</li><li>FIXED_LEN_BYTE_ARRAY: the bytes [...]
point types are encoded in IEEE.</p><p>For the byte array type, it encodes the
length as a 4 byte little
endian, followed by the bytes.</p><h3
id=dictionary-encoding-plain_dictionary--2-and-rle_dictionary--8>Dictionary
Encoding (PLAIN_DICTIONARY = 2 and RLE_DICTIONARY = 8)</h3><p>The dictionary
encoding builds a dictionary of values encountered in a given column. The
@@ -142,6 +142,6 @@ is allowed inside the data page.</p><p>Example:
Original data is three 32-bit floats and for simplicity we look at their raw
representation.</p><pre tabindex=0><code> Element 0 Element 1
Element 2
Bytes AA BB CC DD 00 11 22 33 A3 B4 C5 D6
</code></pre><p>After applying the transformation, the data has the following
representation:</p><pre tabindex=0><code>Bytes AA 00 A3 BB 11 B4 CC 22 C5 DD
33 D6
-</code></pre><div class=td-page-meta__lastmod>Last modified October 23, 2025:
<a data-proofer-ignore
href=https://github.com/apache/parquet-site/commit/df3d5f946504aed3ba9a7b37d44069c1fcd070a4>Clarify
Bit-Packing doc example positioning (#29)
(df3d5f9)</a></div></div></main></div></div><footer class="td-footer row
d-print-none"><div class=container-fluid><div class="row mx-md-2"><div
class="td-footer__left col-6 col-sm-4 order-sm-1"><ul
class=td-footer__links-list><li class=td-footer__li [...]
+</code></pre><div class=td-page-meta__lastmod>Last modified November 12, 2025:
<a data-proofer-ignore
href=https://github.com/apache/parquet-site/commit/a506efc9d2d708e86709841311f3220d828fb4cf>Minor:
Fix various typos on the site (#133)
(a506efc)</a></div></div></main></div></div><footer class="td-footer row
d-print-none"><div class=container-fluid><div class="row mx-md-2"><div
class="td-footer__left col-6 col-sm-4 order-sm-1"><ul
class=td-footer__links-list><li class=td-footer__links-i [...]
2025
<span class=td-footer__authors>Apache Parquet</span></span><span
class=td-footer__all_rights_reserved>All Rights Reserved</span><span
class=ms-2><a href=https://policies.google.com/privacy target=_blank
rel=noopener>Privacy Policy</a></span></div></div></div></footer></div><script
src=/js/main.min.5ea9bb4146d5e591d8c4d97ce08831bb4f2ea2dbf02ed68c5ecd03b7fa7f2bb4.js
integrity="sha256-Xqm7QUbV5ZHYxNl84Igxu08uotvwLtaMXs0Dt/p/K7Q="
crossorigin=anonymous></script><script defer src=/js/click-to [...]
\ No newline at end of file
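Reviewer note on the context lines of the last hunk above: the worked example (three 32-bit floats, bytes regrouped by position) can be reproduced with a short sketch. This is an illustration only and is not part of the commit; the function names `byte_stream_split` and `byte_stream_join` are hypothetical, and the element width of 4 bytes is taken from the "three 32-bit floats" wording in the example.

```python
def byte_stream_split(values: bytes, width: int) -> bytes:
    """Regroup fixed-width elements so that byte k of every element is contiguous.

    `values` holds the raw elements back to back; `width` is the element size
    in bytes (4 for the 32-bit floats in the example above).
    """
    if width <= 0 or len(values) % width != 0:
        raise ValueError("input length must be a multiple of the element width")
    # Stream k collects byte k of element 0, element 1, element 2, ...
    return b"".join(values[k::width] for k in range(width))


def byte_stream_join(streams: bytes, width: int) -> bytes:
    """Inverse of byte_stream_split: rebuild the original elements."""
    if width <= 0 or len(streams) % width != 0:
        raise ValueError("input length must be a multiple of the element width")
    count = len(streams) // width  # number of elements
    # Element i takes one byte from each stream, at offset i within the stream.
    return b"".join(streams[i::count] for i in range(count))


# The three elements from the example, as raw bytes.
original = bytes.fromhex("AABBCCDD" "00112233" "A3B4C5D6")
transformed = byte_stream_split(original, 4)
print(transformed.hex(" ").upper())
```

Running the sketch on the example input gives the same byte order as the "After applying the transformation" line in the hunk, and `byte_stream_join(transformed, 4)` returns the original bytes.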
diff --git a/output/docs/file-format/data-pages/encryption/index.html
b/output/docs/file-format/data-pages/encryption/index.html
index 15cd068..fb71d34 100644
--- a/output/docs/file-format/data-pages/encryption/index.html
+++ b/output/docs/file-format/data-pages/encryption/index.html
@@ -1,7 +1,7 @@
<!doctype html><html itemscope itemtype=http://schema.org/WebPage lang=en
class=no-js><head><meta charset=utf-8><meta name=viewport
content="width=device-width,initial-scale=1,shrink-to-fit=no"><meta name=robots
content="index, follow"><link rel="shortcut icon"
href=/favicons/favicon.ico><link rel=apple-touch-icon
href=/favicons/apple-touch-icon-180x180.png sizes=180x180><link rel=icon
type=image/png href=/favicons/favicon-16x16.png sizes=16x16><link rel=icon
type=image/png href=/favicon [...]
1 Problem Statement Existing data protection solutions (such as flat
encryption of files, in-storage encryption, or use of an encrypting storage
client) can be applied to Parquet files, but have various security or
performance issues. An encryption mechanism, integrated in the Parquet format,
allows for an optimal combination of data security, processing speed and
encryption granularity."><meta property="og:url"
content="/docs/file-format/data-pages/encryption/"><meta property="og:site_n
[...]
-1 Problem Statement Existing data protection solutions (such as flat
encryption of files, in-storage encryption, or use of an encrypting storage
client) can be applied to Parquet files, but have various security or
performance issues. An encryption mechanism, integrated in the Parquet format,
allows for an optimal combination of data security, processing speed and
encryption granularity."><meta property="og:locale" content="en"><meta
property="og:type" content="article"><meta property="a [...]
-1 Problem Statement Existing data protection solutions (such as flat
encryption of files, in-storage encryption, or use of an encrypting storage
client) can be applied to Parquet files, but have various security or
performance issues. An encryption mechanism, integrated in the Parquet format,
allows for an optimal combination of data security, processing speed and
encryption granularity."><meta itemprop=dateModified
content="2024-03-11T22:11:10+01:00"><meta itemprop=wordCount content="39 [...]
+1 Problem Statement Existing data protection solutions (such as flat
encryption of files, in-storage encryption, or use of an encrypting storage
client) can be applied to Parquet files, but have various security or
performance issues. An encryption mechanism, integrated in the Parquet format,
allows for an optimal combination of data security, processing speed and
encryption granularity."><meta property="og:locale" content="en"><meta
property="og:type" content="article"><meta property="a [...]
+1 Problem Statement Existing data protection solutions (such as flat
encryption of files, in-storage encryption, or use of an encrypting storage
client) can be applied to Parquet files, but have various security or
performance issues. An encryption mechanism, integrated in the Parquet format,
allows for an optimal combination of data security, processing speed and
encryption granularity."><meta itemprop=dateModified
content="2025-11-12T06:31:27-05:00"><meta itemprop=wordCount content="39 [...]
1 Problem Statement Existing data protection solutions (such as flat
encryption of files, in-storage encryption, or use of an encrypting storage
client) can be applied to Parquet files, but have various security or
performance issues. An encryption mechanism, integrated in the Parquet format,
allows for an optimal combination of data security, processing speed and
encryption granularity."><link rel=preload
href=/scss/main.min.202e73a8e2d7abd80d0d2060167674ca75bb116f661294014d07d08d239ac5
[...]
<a
href=https://github.com/apache/parquet-site/edit/production/content/en/docs/File%20Format/Data%20Pages/encryption.md
class="td-page-meta--edit td-page-meta__edit" target=_blank rel=noopener><i
class="fa-solid fa-pen-to-square fa-fw"></i> Edit this page</a>
<a
href="https://github.com/apache/parquet-site/new/production/content/en/docs/File%20Format/Data%20Pages?filename=change-me.md&value=---%0Atitle%3A+%22Long+Page+Title%22%0AlinkTitle%3A+%22Short+Nav+Title%22%0Aweight%3A+100%0Adescription%3A+%3E-%0A+++++Page+description+for+heading+and+indexes.%0A---%0A%0A%23%23+Heading%0A%0AEdit+this+template+to+create+your+new+page.%0A%0A%2A+Give+it+a+good+name%2C+ending+in+%60.md%60+-+e.g.+%60getting-started.md%60%0A%2A+Edit+the+%22front+matter%22+
[...]
@@ -261,7 +261,7 @@ structure with the
AES GCM algorithm - using a footer signing key, and an AAD constructed
according to the instructions
of the section 4.4. Only the nonce and GCM tag are stored in the file – as a
28-byte
fixed-length array, written right after the footer itself. The ciphertext is
not stored,
-because it is not required for footer integrity verification by
readers.</p><table><thead><tr><th>nonce (12 bytes)</th><th>tag (16
bytes)</th></tr></thead><tbody></tbody></table><p>The plaintext footer mode
sets the following fields in the the FileMetaData structure:</p><div
class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code
class=language-c data-lang=c><span style=display:flex><span><span
style=color:#204a87;font-weight:700>stru [...]
+because it is not required for footer integrity verification by
readers.</p><table><thead><tr><th>nonce (12 bytes)</th><th>tag (16
bytes)</th></tr></thead><tbody></tbody></table><p>The plaintext footer mode
sets the following fields in the FileMetaData structure:</p><div
class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code
class=language-c data-lang=c><span style=display:flex><span><span
style=color:#204a87;font-weight:700>struct</ [...]
</span></span><span style=display:flex><span><span
style=color:#000;font-weight:700>...</span>
</span></span><span style=display:flex><span> <span
style=color:#8f5902;font-style:italic>/**
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic> * Encryption algorithm. This field is
set only in encrypted files
@@ -289,6 +289,6 @@ data - calculated by comparing the page encryption overhead
(nonce + tag + lengt
to the default page size (1 MB). This is a rough estimation, and can change
with the encryption
algorithm (no 16-byte tag in AES_GCM_CTR_V1) and with page configuration or
data encoding/compression.</p><p>The throughput overhead of Parquet modular
encryption depends on whether AES enciphering is
done in software or hardware. In both cases, performing encryption on full
pages (~1MB buffers)
-instead of on much smaller individual data values causes AES to work at its
maximal speed.</p><div class=td-page-meta__lastmod>Last modified March 11,
2024: <a data-proofer-ignore
href=https://github.com/apache/parquet-site/commit/e79b30489c6bd50f0829a5f2b87f4a26f5e4af05>Fix
typos (#46) (e79b304)</a></div></div></main></div></div><footer
class="td-footer row d-print-none"><div class=container-fluid><div class="row
mx-md-2"><div class="td-footer__left col-6 col-sm-4 order-sm-1"><ul class= [...]
+instead of on much smaller individual data values causes AES to work at its
maximal speed.</p><div class=td-page-meta__lastmod>Last modified November 12,
2025: <a data-proofer-ignore
href=https://github.com/apache/parquet-site/commit/a506efc9d2d708e86709841311f3220d828fb4cf>Minor:
Fix various typos on the site (#133)
(a506efc)</a></div></div></main></div></div><footer class="td-footer row
d-print-none"><div class=container-fluid><div class="row mx-md-2"><div
class="td-footer__left col-6 [...]
2025
<span class=td-footer__authors>Apache Parquet</span></span><span
class=td-footer__all_rights_reserved>All Rights Reserved</span><span
class=ms-2><a href=https://policies.google.com/privacy target=_blank
rel=noopener>Privacy Policy</a></span></div></div></div></footer></div><script
src=/js/main.min.5ea9bb4146d5e591d8c4d97ce08831bb4f2ea2dbf02ed68c5ecd03b7fa7f2bb4.js
integrity="sha256-Xqm7QUbV5ZHYxNl84Igxu08uotvwLtaMXs0Dt/p/K7Q="
crossorigin=anonymous></script><script defer src=/js/click-to [...]
\ No newline at end of file
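Reviewer note on the plaintext footer text in the hunk above: the context lines describe a 28-byte fixed-length array holding the nonce (12 bytes) followed by the GCM tag (16 bytes), written right after the footer. A minimal sketch of splitting that array is shown below. It is an illustration only and is not part of the commit; the helper name `split_footer_signature` is hypothetical, and the sketch does not perform any AES GCM verification.

```python
from typing import Tuple

NONCE_LEN = 12  # bytes, per the "nonce (12 bytes)" table header
TAG_LEN = 16    # bytes, per the "tag (16 bytes)" table header


def split_footer_signature(signature: bytes) -> Tuple[bytes, bytes]:
    """Split the 28-byte footer signature into (nonce, tag).

    Only the layout is handled here; checking the tag against the footer
    would additionally need the footer signing key and the AAD.
    """
    if len(signature) != NONCE_LEN + TAG_LEN:
        raise ValueError("footer signature must be exactly 28 bytes")
    return signature[:NONCE_LEN], signature[NONCE_LEN:]


# Dummy bytes standing in for a real signature read from a file.
nonce, tag = split_footer_signature(bytes(range(28)))
print(len(nonce), len(tag))
```

A reader would pass the returned nonce and tag, together with the footer signing key and the AAD, to an AES GCM implementation to verify footer integrity.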
diff --git a/output/docs/file-format/data-pages/index.xml
b/output/docs/file-format/data-pages/index.xml
index 4567f00..7db45c2 100644
--- a/output/docs/file-format/data-pages/index.xml
+++ b/output/docs/file-format/data-pages/index.xml
@@ -14,7 +14,7 @@ in the <code>PageHeader</code>
struct.</p></description></item
<p>Supported Types: all</p>
<p>This is the plain encoding that must be supported for types. It is
intended to be the simplest encoding. Values are encoded back to
back.</p>
-<p>The plain encoding is used whenever a more efficient encoding can not
be used. It
+<p>The plain encoding is used whenever a more efficient encoding cannot
be used. It
stores the data in the following format:</p>
<ul>
<li>BOOLEAN: <a
href="/docs/file-format/data-pages/encodings/#BITPACKED">Bit
Packed</a>, LSB first</li>
diff --git a/output/docs/overview/index.html b/output/docs/overview/index.html
index be167f3..cdca2ee 100644
--- a/output/docs/overview/index.html
+++ b/output/docs/overview/index.html
@@ -1,8 +1,8 @@
-<!doctype html><html itemscope itemtype=http://schema.org/WebPage lang=en
class=no-js><head><meta charset=utf-8><meta name=viewport
content="width=device-width,initial-scale=1,shrink-to-fit=no"><link
rel=alternate type=text/html href=/_print/docs/overview/><link rel=alternate
type=application/rss+xml href=/docs/overview/index.xml><meta name=robots
content="index, follow"><link rel="shortcut icon"
href=/favicons/favicon.ico><link rel=apple-touch-icon
href=/favicons/apple-touch-icon-180x18 [...]
+<!doctype html><html itemscope itemtype=http://schema.org/WebPage lang=en
class=no-js><head><meta charset=utf-8><meta name=viewport
content="width=device-width,initial-scale=1,shrink-to-fit=no"><link
rel=alternate type=text/html href=/_print/docs/overview/><link rel=alternate
type=application/rss+xml href=/docs/overview/index.xml><meta name=robots
content="index, follow"><link rel="shortcut icon"
href=/favicons/favicon.ico><link rel=apple-touch-icon
href=/favicons/apple-touch-icon-180x18 [...]
<a
href=https://github.com/apache/parquet-site/edit/production/content/en/docs/Overview/_index.md
class="td-page-meta--edit td-page-meta__edit" target=_blank rel=noopener><i
class="fa-solid fa-pen-to-square fa-fw"></i> Edit this page</a>
<a
href="https://github.com/apache/parquet-site/new/production/content/en/docs/Overview?filename=change-me.md&value=---%0Atitle%3A+%22Long+Page+Title%22%0AlinkTitle%3A+%22Short+Nav+Title%22%0Aweight%3A+100%0Adescription%3A+%3E-%0A+++++Page+description+for+heading+and+indexes.%0A---%0A%0A%23%23+Heading%0A%0AEdit+this+template+to+create+your+new+page.%0A%0A%2A+Give+it+a+good+name%2C+ending+in+%60.md%60+-+e.g.+%60getting-started.md%60%0A%2A+Edit+the+%22front+matter%22+section+at+the+top
[...]
<a href="https://github.com/apache/parquet-site/issues/new?title=Overview"
class="td-page-meta--issue td-page-meta__issue" target=_blank rel=noopener><i
class="fa-solid fa-list-check fa-fw"></i> Create documentation issue</a>
<a id=print href=/_print/docs/overview/><i class="fa-solid fa-print
fa-fw"></i> Print entire section</a></div><div class=td-toc><nav
id=TableOfContents><ul><li><ul><li><a
href=#parquet-format-specification>parquet-format
(Specification)</a></li><li><a href=#parquet-java>parquet-java</a></li><li><a
href=#other-clients--libraries--tools>Other Clients / Libraries /
Tools</a></li></ul></li></ul></nav></div></aside><main class="col-12 col-md-9
col-xl-8 ps-md-5" role=main><nav aria-label=bread [...]
-It provides high performance compression and encoding schemes to handle
complex data in bulk and is supported in many programming language and
analytics tools.</p><h3 id=parquet-format-specification>parquet-format
(Specification)</h3><p>The <a
href=https://github.com/apache/parquet-format>parquet-format</a> repository
hosts the official specification of the Parquet file format, defining how data
is structured and stored. This specification, along with the <a
href=https://github.com/apach [...]
+It provides high performance compression and encoding schemes to handle
complex data in bulk and is supported in many programming languages and
analytics tools.</p><h3 id=parquet-format-specification>parquet-format
(Specification)</h3><p>The <a
href=https://github.com/apache/parquet-format>parquet-format</a> repository
hosts the official specification of the Parquet file format, defining how data
is structured and stored. This specification, along with the <a
href=https://github.com/apac [...]
2025
<span class=td-footer__authors>Apache Parquet</span></span><span
class=td-footer__all_rights_reserved>All Rights Reserved</span><span
class=ms-2><a href=https://policies.google.com/privacy target=_blank
rel=noopener>Privacy Policy</a></span></div></div></div></footer></div><script
src=/js/main.min.5ea9bb4146d5e591d8c4d97ce08831bb4f2ea2dbf02ed68c5ecd03b7fa7f2bb4.js
integrity="sha256-Xqm7QUbV5ZHYxNl84Igxu08uotvwLtaMXs0Dt/p/K7Q="
crossorigin=anonymous></script><script defer src=/js/click-to [...]
\ No newline at end of file
diff --git a/output/index.html b/output/index.html
index a5bbe40..e2ab1cf 100644
--- a/output/index.html
+++ b/output/index.html
@@ -1,5 +1,5 @@
-<!doctype html><html itemscope itemtype=http://schema.org/WebPage lang=en
class=no-js><head><meta name=generator content="Hugo 0.152.0"><meta
charset=utf-8><meta name=viewport
content="width=device-width,initial-scale=1,shrink-to-fit=no"><link
rel=alternate type=application/rss+xml href=/index.xml><meta name=robots
content="index, follow"><link rel="shortcut icon"
href=/favicons/favicon.ico><link rel=apple-touch-icon
href=/favicons/apple-touch-icon-180x180.png sizes=180x180><link rel=ico [...]
+<!doctype html><html itemscope itemtype=http://schema.org/WebPage lang=en
class=no-js><head><meta name=generator content="Hugo 0.152.0"><meta
charset=utf-8><meta name=viewport
content="width=device-width,initial-scale=1,shrink-to-fit=no"><link
rel=alternate type=application/rss+xml href=/index.xml><meta name=robots
content="index, follow"><link rel="shortcut icon"
href=/favicons/favicon.ico><link rel=apple-touch-icon
href=/favicons/apple-touch-icon-180x180.png sizes=180x180><link rel=ico [...]
</a><a class="btn btn-lg btn-secondary me-3 mb-4" href=/blog/>Releases <i
class="fa fa-arrow-circle-down ms-2"></i></a><p class="lead mt-5">Apache
Parquet is an open source, column-oriented data file format designed for
efficient data storage and retrieval.
-It provides high performance compression and encoding schemes to handle
complex data in bulk and is supported in many programming language and
analytics tools.</p><p><a class="btn btn-link text-info" href=#td-block-1
aria-label="Read more"><i class="fa-solid fa-circle-chevron-down"
style=font-size:400%></i></a></p></div></div></div></div></section><div><a
id=td-block-1 class=td-offset-anchor></a></div><section class="row td-box
td-box--white td-box--height-auto"><div class=col><div class [...]
+It provides high performance compression and encoding schemes to handle
complex data in bulk and is supported in many programming languages and
analytics tools.</p><p><a class="btn btn-link text-info" href=#td-block-1
aria-label="Read more"><i class="fa-solid fa-circle-chevron-down"
style=font-size:400%></i></a></p></div></div></div></div></section><div><a
id=td-block-1 class=td-offset-anchor></a></div><section class="row td-box
td-box--white td-box--height-auto"><div class=col><div clas [...]
2025
<span class=td-footer__authors>Apache Parquet</span></span><span
class=td-footer__all_rights_reserved>All Rights Reserved</span><span
class=ms-2><a href=https://policies.google.com/privacy target=_blank
rel=noopener>Privacy Policy</a></span></div></div></div></footer></div><script
src=/js/main.min.5ea9bb4146d5e591d8c4d97ce08831bb4f2ea2dbf02ed68c5ecd03b7fa7f2bb4.js
integrity="sha256-Xqm7QUbV5ZHYxNl84Igxu08uotvwLtaMXs0Dt/p/K7Q="
crossorigin=anonymous></script><script defer src=/js/click-to [...]
\ No newline at end of file
diff --git a/output/index.xml b/output/index.xml
index 46ba3a6..041f356 100644
--- a/output/index.xml
+++ b/output/index.xml
@@ -14,7 +14,7 @@ in the <code>PageHeader</code>
struct.</p></description></item
<p>Supported Types: all</p>
<p>This is the plain encoding that must be supported for types. It is
intended to be the simplest encoding. Values are encoded back to
back.</p>
-<p>The plain encoding is used whenever a more efficient encoding can not
be used. It
+<p>The plain encoding is used whenever a more efficient encoding cannot
be used. It
stores the data in the following format:</p>
<ul>
<li>BOOLEAN: <a
href="/docs/file-format/data-pages/encodings/#BITPACKED">Bit
Packed</a>, LSB first</li>
diff --git a/output/sitemap.xml b/output/sitemap.xml
index 2066b69..9fdad35 100644
--- a/output/sitemap.xml
+++ b/output/sitemap.xml
@@ -1 +1 @@
-<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/docs/file-format/data-pages/compression/</loc><lastmod>2024-03-11T22:11:10+01:00</lastmod></url><url><loc>/docs/file-format/data-pages/encodings/</loc><lastmod>2025-10-23T22:58:27+02:00</lastmod></url><url><loc>/docs/file-format/data-pages/encryption/</loc><lastmod>2024-03-11T22:11:10+01:00</lastmod></url><url><loc>/docs/
[...]
\ No newline at end of file
+<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/docs/file-format/data-pages/compression/</loc><lastmod>2024-03-11T22:11:10+01:00</lastmod></url><url><loc>/docs/file-format/data-pages/encodings/</loc><lastmod>2025-11-12T06:31:27-05:00</lastmod></url><url><loc>/docs/file-format/data-pages/encryption/</loc><lastmod>2025-11-12T06:31:27-05:00</lastmod></url><url><loc>/docs/
[...]
\ No newline at end of file