This is an automated email from the ASF dual-hosted git repository.
wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/parquet-testing.git
The following commit(s) were added to refs/heads/master by this push:
new aafd3fc add test file for page index filter. (#25)
aafd3fc is described below
commit aafd3fc9df431c2625a514fb46626e5614f1d199
Author: Yang Jiang <[email protected]>
AuthorDate: Mon Jul 4 22:32:23 2022 +0800
add test file for page index filter. (#25)
* add test file for page index filter.
* add link
---
data/README.md | 22 ++++++++++++----------
data/alltypes_tiny_pages.parquet | Bin 0 -> 454233 bytes
data/alltypes_tiny_pages_plain.parquet | Bin 0 -> 811756 bytes
3 files changed, 12 insertions(+), 10 deletions(-)
diff --git a/data/README.md b/data/README.md
index b1227d4..970c37b 100644
--- a/data/README.md
+++ b/data/README.md
@@ -19,16 +19,18 @@
# Test data files for Parquet compatibility and regression testing
-| File | Description |
-|----------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| delta_byte_array.parquet | string columns with DELTA_BYTE_ARRAY encoding. See [delta_byte_array.md](delta_byte_array.md) for details. |
-| delta_length_byte_array.parquet | string columns with DELTA_LENGTH_BYTE_ARRAY encoding. |
-| delta_binary_packed.parquet | INT32 and INT64 columns with DELTA_BINARY_PACKED encoding. See [delta_binary_packed.md](delta_binary_packed.md) for details. |
-| delta_encoding_required_column.parquet | required INT32 and STRING columns with delta encoding. See [delta_encoding_required_column.md](delta_encoding_required_column.md) for details. |
-| delta_encoding_optional_column.parquet | optional INT64 and STRING columns with delta encoding. See [delta_encoding_optional_column.md](delta_encoding_optional_column.md) for details. |
-| nested_structs.rust.parquet | Used to test that the Rust Arrow reader can lookup the correct field from a nested struct. See [ARROW-11452](https://issues.apache.org/jira/browse/ARROW-11452) |
-| data_index_bloom_encoding_stats.parquet | optional STRING column. Contains optional metadata: bloom filters, column index, offset index and encoding stats. |
-|null_list.parquet | an empty list. Generated from this json `{"emptylist":[]}` and for the purposes of testing correct read/write behaviour of this base case. |
+| File | Description |
+|----------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| delta_byte_array.parquet | string columns with DELTA_BYTE_ARRAY encoding. See [delta_byte_array.md](delta_byte_array.md) for details. |
+| delta_length_byte_array.parquet | string columns with DELTA_LENGTH_BYTE_ARRAY encoding. |
+| delta_binary_packed.parquet | INT32 and INT64 columns with DELTA_BINARY_PACKED encoding. See [delta_binary_packed.md](delta_binary_packed.md) for details. |
+| delta_encoding_required_column.parquet | required INT32 and STRING columns with delta encoding. See [delta_encoding_required_column.md](delta_encoding_required_column.md) for details. |
+| delta_encoding_optional_column.parquet | optional INT64 and STRING columns with delta encoding. See [delta_encoding_optional_column.md](delta_encoding_optional_column.md) for details. |
+| nested_structs.rust.parquet | Used to test that the Rust Arrow reader can lookup the correct field from a nested struct. See [ARROW-11452](https://issues.apache.org/jira/browse/ARROW-11452) |
+| data_index_bloom_encoding_stats.parquet | optional STRING column. Contains optional metadata: bloom filters, column index, offset index and encoding stats. |
+| null_list.parquet | an empty list. Generated from this json `{"emptylist":[]}` and for the purposes of testing correct read/write behaviour of this base case. |
+| alltypes_tiny_pages.parquet | small page sizes with dictionary encoding with page index from [impala](https://github.com/apache/impala/tree/master/testdata/data/alltypes_tiny_pages.parquet). |
+| alltypes_tiny_pages_plain.parquet | small page sizes with plain encoding with page index [impala](https://github.com/apache/impala/tree/master/testdata/data/alltypes_tiny_pages.parquet). |
TODO: Document what each file is in the table above.
diff --git a/data/alltypes_tiny_pages.parquet b/data/alltypes_tiny_pages.parquet
new file mode 100644
index 0000000..90019d1
Binary files /dev/null and b/data/alltypes_tiny_pages.parquet differ
diff --git a/data/alltypes_tiny_pages_plain.parquet b/data/alltypes_tiny_pages_plain.parquet
new file mode 100644
index 0000000..68d4dcb
Binary files /dev/null and b/data/alltypes_tiny_pages_plain.parquet differ