This is an automated email from the ASF dual-hosted git repository.

gangwu pushed a commit to branch production
in repository https://gitbox.apache.org/repos/asf/parquet-site.git


The following commit(s) were added to refs/heads/production by this push:
     new c6af28b  Add implementation status of javascript hyparquet (#102)
c6af28b is described below

commit c6af28b034ca25ce5616537c1f9cea89f8c48a93
Author: Kenny Daniel <[email protected]>
AuthorDate: Tue Feb 25 17:35:29 2025 -0800

    Add implementation status of javascript hyparquet (#102)
---
 .../en/docs/File Format/implementationstatus.md    | 140 ++++++++++-----------
 1 file changed, 70 insertions(+), 70 deletions(-)

diff --git a/content/en/docs/File Format/implementationstatus.md b/content/en/docs/File Format/implementationstatus.md
index 4164c11..5f68613 100644
--- a/content/en/docs/File Format/implementationstatus.md       
+++ b/content/en/docs/File Format/implementationstatus.md       
@@ -22,91 +22,91 @@ Implementations:
 * `Go`: [parquet-go](https://github.com/apache/arrow-go/tree/main/parquet)
 * `Rust`: [parquet-rs](https://github.com/apache/arrow-rs/blob/main/parquet/README.md)
 * `cuDF`: [cudf](https://github.com/rapidsai/cudf)
-
+* `JavaScript`: [hyparquet](https://github.com/hyparam/hyparquet)
 
 
 ### Physical types
 
-| Data type                                 | C++   | Java  | Go    | Rust  | cuDF  |
-| ----------------------------------------- | ----- | ----- | ----- | ----- | ----- |
-| BOOLEAN                                   |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| INT32                                     |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| INT64                                     |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| INT96 (1)                                 |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| FLOAT                                     |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| DOUBLE                                    |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| BYTE_ARRAY                                |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| FIXED_LEN_BYTE_ARRAY                      |  ✅   |  ✅   |       |  ✅   |  ✅   |
+| Data type                                 | C++   | Java  | Go    | Rust  | cuDF  | hyparquet |
+| ----------------------------------------- | ----- | ----- | ----- | ----- | ----- | --------- |
+| BOOLEAN                                   |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| INT32                                     |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| INT64                                     |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| INT96 (1)                                 |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| FLOAT                                     |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| DOUBLE                                    |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| BYTE_ARRAY                                |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| FIXED_LEN_BYTE_ARRAY                      |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
 
 * \(1) This type is deprecated, but as of 2024 it's common in currently produced parquet files
 
 
 ### Logical types
 
-| Data type                                 | C++   | Java  | Go    | Rust  | cuDF  |
-| ----------------------------------------- | ----- | ----- | ----- | ----- | ----- |
-| STRING                                    |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| ENUM                                      |  ❌   |  ✅   |       |  ✅(*)|  ❌   |
-| UUID                                      |  ❌   |  ✅   |       |  ✅(*)|  ❌   |
-| 8, 16, 32, 64 bit signed and unsigned INT |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| DECIMAL (INT32)                           |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| DECIMAL (INT64)                           |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| DECIMAL (BYTE_ARRAY)                      |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| DECIMAL (FIXED_LEN_BYTE_ARRAY)            |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| DATE                                      |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| TIME (INT32)                              |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| TIME (INT64)                              |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| TIMESTAMP (INT64)                         |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| INTERVAL                                  |  ✅   |  ✅(*)|       |  ✅   |  ❌   |
-| JSON                                      |  ✅   |  ✅(*)|       |  ✅(*)|  ❌   |
-| BSON                                      |  ❌   |  ✅(*)|       |  ✅(*)|  ❌   |
-| LIST                                      |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| MAP                                       |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| UNKNOWN (always null)                     |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| FLOAT16                                   |  ✅   |  ✅(*)|       |  ✅   |  ✅   |
+| Data type                                 | C++   | Java  | Go    | Rust  | cuDF  | hyparquet |
+| ----------------------------------------- | ----- | ----- | ----- | ----- | ----- | --------- |
+| STRING                                    |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| ENUM                                      |  ❌   |  ✅   |       |  ✅(*)|  ❌   | (R)       |
+| UUID                                      |  ❌   |  ✅   |       |  ✅(*)|  ❌   | (R)       |
+| 8, 16, 32, 64 bit signed and unsigned INT |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| DECIMAL (INT32)                           |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| DECIMAL (INT64)                           |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| DECIMAL (BYTE_ARRAY)                      |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| DECIMAL (FIXED_LEN_BYTE_ARRAY)            |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| DATE                                      |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| TIME (INT32)                              |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| TIME (INT64)                              |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| TIMESTAMP (INT64)                         |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| INTERVAL                                  |  ✅   |  ✅(*)|       |  ✅   |  ❌   | (R)       |
+| JSON                                      |  ✅   |  ✅(*)|       |  ✅(*)|  ❌   | (R)       |
+| BSON                                      |  ❌   |  ✅(*)|       |  ✅(*)|  ❌   | (R)       |
+| LIST                                      |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| MAP                                       |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| UNKNOWN (always null)                     |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| FLOAT16                                   |  ✅   |  ✅(*)|       |  ✅   |  ✅   | (R)       |
 
 (*): Only supported to use its annotated physical type
 
 ### Encodings
 
-| Encoding                                  | C++   | Java  | Go    | Rust  | cuDF  |
-| ----------------------------------------- | ----- | ----- | ----- | ----- | ----- |
-| PLAIN                                     |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| PLAIN_DICTIONARY                          |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| RLE_DICTIONARY                            |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| RLE                                       |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| BIT_PACKED (deprecated)                   |  ✅   |  ✅   |       |  ❌(*)|  (R)  |
-| DELTA_BINARY_PACKED                       |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| DELTA_LENGTH_BYTE_ARRAY                   |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| DELTA_BYTE_ARRAY                          |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| BYTE_STREAM_SPLIT                         |  ✅   |  ✅   |       |  ✅   |  ✅   |
+| Encoding                                  | C++   | Java  | Go    | Rust  | cuDF  | hyparquet |
+| ----------------------------------------- | ----- | ----- | ----- | ----- | ----- | --------- |
+| PLAIN                                     |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| PLAIN_DICTIONARY                          |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| RLE_DICTIONARY                            |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| RLE                                       |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| BIT_PACKED (deprecated)                   |  ✅   |  ✅   |       |  ❌(*)|  (R)  | (R)       |
+| DELTA_BINARY_PACKED                       |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| DELTA_LENGTH_BYTE_ARRAY                   |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| DELTA_BYTE_ARRAY                          |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| BYTE_STREAM_SPLIT                         |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
 
 (*): Partial read support, but only in the case of level data with a bitwidth of 0
 
 ### Compressions
 
-| Compression                               | C++   | Java  | Go    | Rust  | cuDF  |
-| ----------------------------------------- | ----- | ----- | ----- | ----- | ----- |
-| UNCOMPRESSED                              |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| BROTLI                                    |  ✅   |  ✅   |       |  ✅   |  (R)  |
-| GZIP                                      |  ✅   |  ✅   |       |  ✅   |  (R)  |
-| LZ4 (deprecated)                          |  ✅   |  ❌   |       |  ✅   |  ❌   |
-| LZ4_RAW                                   |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| LZO                                       |  ❌   |  ❌   |       |  ❌   |  ❌   |
-| SNAPPY                                    |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| ZSTD                                      |  ✅   |  ✅   |       |  ✅   |  ✅   |
+| Compression                               | C++   | Java  | Go    | Rust  | cuDF  | hyparquet |
+| ----------------------------------------- | ----- | ----- | ----- | ----- | ----- | --------- |
+| UNCOMPRESSED                              |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| BROTLI                                    |  ✅   |  ✅   |       |  ✅   |  (R)  | (R)       |
+| GZIP                                      |  ✅   |  ✅   |       |  ✅   |  (R)  | (R)       |
+| LZ4 (deprecated)                          |  ✅   |  ❌   |       |  ✅   |  ❌   | (R)       |
+| LZ4_RAW                                   |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| LZO                                       |  ❌   |  ❌   |       |  ❌   |  ❌   | ❌        |
+| SNAPPY                                    |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| ZSTD                                      |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
 
 ### Other format level features
 
-|                                           | C++   | Java  | Go    | Rust  | cuDF  |
-| ----------------------------------------- | ----- | ----- | ----- | ----- | ----- |
-| xxHash-based bloom filters                |  (R)  |  ✅   |       |  ✅   |  (R)  |
-| Bloom filter length (1)                   |  (R)  |  ✅   |       |  ✅   |  (R)  |
-| Statistics min_value, max_value           |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| Page index                                |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| Page CRC32 checksum                       |  ✅   |  ✅   |       |  ✅   |  ❌   |
-| Modular encryption                        |  ✅   |  ✅   |       |  ❌   |  ❌   |
-| Size statistics (2)                       |  ✅   |  ✅   |       |  ✅   |  ✅   |
+|                                           | C++   | Java  | Go    | Rust  | cuDF  | hyparquet |
+| ----------------------------------------- | ----- | ----- | ----- | ----- | ----- | --------- |
+| xxHash-based bloom filters                |  (R)  |  ✅   |       |  ✅   |  (R)  |           |
+| Bloom filter length (1)                   |  (R)  |  ✅   |       |  ✅   |  (R)  |           |
+| Statistics min_value, max_value           |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| Page index                                |  ✅   |  ✅   |       |  ✅   |  ✅   | (R)       |
+| Page CRC32 checksum                       |  ✅   |  ✅   |       |  ✅   |  ❌   | ❌        |
+| Modular encryption                        |  ✅   |  ✅   |       |  ❌   |  ❌   | ❌        |
+| Size statistics (2)                       |  ✅   |  ✅   |       |  ✅   |  ✅   |           |
 
 
 * \(1) In parquet.thrift: ColumnMetaData->bloom_filter_length
@@ -115,14 +115,14 @@ Implementations:
 
 ### High level data APIs for Parquet feature usage
 
-| Format                                       | C++   | Java  | Go    | Rust  | cuDF  |
-| -------------------------------------------- | ----- | ----- | ----- | ----- | ----- |
-| External column data (1)                     |  ✅   |  ✅   |       |  ❌   |  (W)  |
-| Row group "Sorting column" metadata (2)      |  ✅   |  ❌   |       |  ✅   |  (W)  |
-| Row group pruning using statistics           |  ❌   |  ✅   |       |  ✅   |  ✅   |
-| Row group pruning using bloom filter         |  ❌   |  ✅   |       |  ✅   |  ✅   |
-| Reading select columns only                  |  ✅   |  ✅   |       |  ✅   |  ✅   |
-| Page pruning using statistics                |  ❌   |  ✅   |       |  ✅   |  ❌   |
+| Format                                       | C++   | Java  | Go    | Rust  | cuDF  | hyparquet |
+| -------------------------------------------- | ----- | ----- | ----- | ----- | ----- | --------- |
+| External column data (1)                     |  ✅   |  ✅   |       |  ❌   |  (W)  | ❌        |
+| Row group "Sorting column" metadata (2)      |  ✅   |  ❌   |       |  ✅   |  (W)  | ❌        |
+| Row group pruning using statistics           |  ❌   |  ✅   |       |  ✅   |  ✅   | ❌        |
+| Row group pruning using bloom filter         |  ❌   |  ✅   |       |  ✅   |  ✅   | ❌        |
+| Reading select columns only                  |  ✅   |  ✅   |       |  ✅   |  ✅   | ✅        |
+| Page pruning using statistics                |  ❌   |  ✅   |       |  ✅   |  ❌   | ❌        |
 
 
 * \(1) In parquet.thrift: ColumnChunk->file_path
