This is an automated email from the ASF dual-hosted git repository.
fokko pushed a commit to branch production
in repository https://gitbox.apache.org/repos/asf/parquet-site.git
The following commit(s) were added to refs/heads/production by this push:
new e79b304 Fix typos (#46)
e79b304 is described below
commit e79b30489c6bd50f0829a5f2b87f4a26f5e4af05
Author: Andreas Deininger <[email protected]>
AuthorDate: Mon Mar 11 22:11:10 2024 +0100
Fix typos (#46)
* Fix typos
* Fix broken link, use https instead of http
---
content/en/docs/Contribution Guidelines/contributing.md | 2 +-
content/en/docs/Contribution Guidelines/releasing.md | 4 ++--
content/en/docs/File Format/Data Pages/compression.md | 6 +++---
content/en/docs/File Format/Data Pages/encodings.md | 4 ++--
content/en/docs/File Format/Data Pages/encryption.md | 2 +-
content/en/docs/File Format/bloomfilter.md | 8 ++++----
6 files changed, 13 insertions(+), 13 deletions(-)
diff --git a/content/en/docs/Contribution Guidelines/contributing.md b/content/en/docs/Contribution Guidelines/contributing.md
index 58b3bf5..b965556 100644
--- a/content/en/docs/Contribution Guidelines/contributing.md
+++ b/content/en/docs/Contribution Guidelines/contributing.md
@@ -23,7 +23,7 @@ If you’d like to report a bug but don’t have time to fix it, you can still p
Committers
----------
-Merging a pull request requires being a comitter on the project.
+Merging a pull request requires being a committer on the project.
How to merge a Pull request (have an apache and github-apache remote setup):
diff --git a/content/en/docs/Contribution Guidelines/releasing.md b/content/en/docs/Contribution Guidelines/releasing.md
index 7ea25ce..2a8c305 100644
--- a/content/en/docs/Contribution Guidelines/releasing.md
+++ b/content/en/docs/Contribution Guidelines/releasing.md
@@ -19,7 +19,7 @@ If you have problems, read the [publishing Maven artifacts documentation](https:
### Release process
-Parquet uses the maven-release-plugin to tag a release and push binary artifacts to staging in Nexus. Once maven completes the release, the offical source tarball is built from the tag.
+Parquet uses the maven-release-plugin to tag a release and push binary artifacts to staging in Nexus. Once maven completes the release, the official source tarball is built from the tag.
Before you start the release process:
@@ -153,7 +153,7 @@ Then add and commit the release artifacts:
#### 4\. Update parquet.apache.org
-Update the downloads page on parquet.apache.org. Instructions for updating the site are on the [contribution page](http://parquet.apache.org/docs/contribution-guidelines/contributing/).
+Update the downloads page on parquet.apache.org. Instructions for updating the site are on the [contribution page](https://parquet.apache.org/docs/contribution-guidelines/contributing/).
#### 5\. Send an ANNOUNCE e-mail to [[email protected]](mailto:[email protected]) and the dev list
diff --git a/content/en/docs/File Format/Data Pages/compression.md b/content/en/docs/File Format/Data Pages/compression.md
index 3217612..7392292 100644
--- a/content/en/docs/File Format/Data Pages/compression.md
+++ b/content/en/docs/File Format/Data Pages/compression.md
@@ -47,7 +47,7 @@ that writers refrain from creating such pages by default for better interoperabi
### LZO
A codec based on or interoperable with the
-[LZO compression library](http://www.oberhumer.com/opensource/lzo/).
+[LZO compression library](https://www.oberhumer.com/opensource/lzo/).
### BROTLI
@@ -73,11 +73,11 @@ switch to the newer, interoperable `LZ4_RAW` codec.
A codec based on the Zstandard format defined by
[RFC 8478](https://tools.ietf.org/html/rfc8478). If any ambiguity arises
when implementing this format, the implementation provided by the
-[ZStandard compression library](https://facebook.github.io/zstd/)
+[Zstandard compression library](https://facebook.github.io/zstd/)
is authoritative.
### LZ4_RAW
A codec based on the [LZ4 block format](https://github.com/lz4/lz4/blob/dev/doc/lz4_Block_format.md).
If any ambiguity arises when implementing this format, the implementation
-provided by the [LZ4 compression library](http://www.lz4.org/) is authoritative.
+provided by the [LZ4 compression library](https://www.lz4.org/) is authoritative.
diff --git a/content/en/docs/File Format/Data Pages/encodings.md b/content/en/docs/File Format/Data Pages/encodings.md
index 3ff8d05..ea27d46 100644
--- a/content/en/docs/File Format/Data Pages/encodings.md
+++ b/content/en/docs/File Format/Data Pages/encodings.md
@@ -158,7 +158,7 @@ repetition and definition levels.
Supported Types: INT32, INT64
This encoding is adapted from the Binary packing described in
-["Decoding billions of integers per second through vectorization"](http://arxiv.org/pdf/1209.2137v5.pdf)
+["Decoding billions of integers per second through vectorization"](https://arxiv.org/pdf/1209.2137v5.pdf)
by D. Lemire and L. Boytsov.
In delta encoding we make use of variable length integers for storing various
@@ -189,7 +189,7 @@ Each block contains
positive integers for bit packing)
* the bitwidth of each block is stored as a byte
* each miniblock is a list of bit packed ints according to the bit width
- stored at the begining of the block
+ stored at the beginning of the block
To encode a block, we will:
diff --git a/content/en/docs/File Format/Data Pages/encryption.md b/content/en/docs/File Format/Data Pages/encryption.md
index 1f736c5..62f803e 100644
--- a/content/en/docs/File Format/Data Pages/encryption.md
+++ b/content/en/docs/File Format/Data Pages/encryption.md
@@ -189,7 +189,7 @@ data set (table). This string is optionally passed by a writer upon file creatio
the AAD prefix is stored in an `aad_prefix` field in the file, and is made available to the readers.
This field is not encrypted. If a user is concerned about keeping the file identity inside the file,
the writer code can explicitly request Parquet not to store the AAD prefix. Then the aad_prefix field
-will be empty; AAD prefixes must be fully managed by the caller code and supplied explictly to Parquet
+will be empty; AAD prefixes must be fully managed by the caller code and supplied explicitly to Parquet
readers for each file.
The protection against swapping full files is optional. It is not enabled by default because
diff --git a/content/en/docs/File Format/bloomfilter.md b/content/en/docs/File Format/bloomfilter.md
index e4203b4..6fe0aaf 100644
--- a/content/en/docs/File Format/bloomfilter.md
+++ b/content/en/docs/File Format/bloomfilter.md
@@ -154,7 +154,7 @@ unsigned int32 i = (h_top_bits * z_as_64_bit) >> 32;
```
The first line extracts the most significant 32 bits from `h` and
-assignes them to a 64-bit unsigned integer. The second line is
+assigns them to a 64-bit unsigned integer. The second line is
simpler: it just sets an unsigned 64-bit value to the same value as
the 32-bit unsigned value `z`. The purpose of having both `h_top_bits`
and `z_as_64_bit` be 64-bit values is so that their product is a
@@ -205,7 +205,7 @@ boolean filter_check(SBBF filter, unsigned int64 x) {
The use of blocks is from Putze et al.'s [Cache-, Hash- and Space-Efficient Bloom
-filters](http://algo2.iti.kit.edu/documents/cacheefficientbloomfilters-jea.pdf)
+filters](https://www.cs.amherst.edu/~ccmcgeoch/cs34/papers/cacheefficientbloomfilters-jea.pdf)
To use an SBBF for values of arbitrary Parquet types, we apply a hash
function to that value - at the time of writing,
@@ -217,14 +217,14 @@ with a seed of 0 and [following the specification version
The `check` operation in SBBFs can return `true` for an argument that
was never inserted into the SBBF. These are called "false
-positives". The "false positive probabilty" is the probability that
+positives". The "false positive probability" is the probability that
any given hash value that was never `insert`ed into the SBBF will
cause `check` to return `true` (a false positive). There is not a
simple closed-form calculation of this probability, but here is an
example:
A filter that uses 1024 blocks and has had 26,214 hash values
-`insert`ed will have a false positive probabilty of around 1.26%. Each
+`insert`ed will have a false positive probability of around 1.26%. Each
of those 1024 blocks occupies 256 bits of space, so the total space
usage is 262,144. That means that the ratio of bits of space to hash
values is 10-to-1. Adding more hash values increases the denominator