This is an automated email from the ASF dual-hosted git repository.
gangwu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/parquet-format.git
The following commit(s) were added to refs/heads/master by this push:
new 143b88a MINOR: Fix typo in PageIndex.md (#212)
143b88a is described below
commit 143b88a1cc483b8d32178842b7b55a7833ff76ea
Author: Hannankan <[email protected]>
AuthorDate: Wed Aug 30 13:51:31 2023 +0800
MINOR: Fix typo in PageIndex.md (#212)
---
PageIndex.md | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/PageIndex.md b/PageIndex.md
index 96f7a47..f4a8f64 100644
--- a/PageIndex.md
+++ b/PageIndex.md
@@ -34,17 +34,17 @@ data from disk.
## Goals
1. Make both range scans and point lookups I/O efficient by allowing direct
access to pages based on their min and max values. In particular:
-2. A single-row lookup in a row group based on the sort column of that row group
+ * A single-row lookup in a row group based on the sort column of that row group
will only read one data page per the retrieved column.
* Range scans on the sort column will only need to read the exact data
pages that contain relevant data.
* Make other selective scans I/O efficient: if we have a very selective
predicate on a non-sorting column, for the other retrieved columns we
should only need to access data pages that contain matching rows.
-3. No additional decoding effort for scans without selective predicates, e.g.,
+2. No additional decoding effort for scans without selective predicates, e.g.,
full-row group scans. If a reader determines that it does not need to read
the index data, it does not incur any overhead.
-4. Index pages for sorted columns use minimal storage by storing only the
+3. Index pages for sorted columns use minimal storage by storing only the
boundary elements between pages.
## Non-Goals