etseidl commented on code in PR #34:
URL: https://github.com/apache/parquet-site/pull/34#discussion_r1633838679


##########
content/en/docs/File Format/implementationstatus.md:
##########
@@ -0,0 +1,101 @@
+---
+title: "Implementation status"
+linkTitle: "Implementation status"
+weight: 8
+---
+### Physical types
+
+| Data type                                 | C++   | Java   | Go    | Rust  |
+| ----------------------------------------- | ----- | ------ | ----- | ----- |
+| BOOLEAN                                   |       |        |       |       |
+| INT32                                     |       |        |       |       |
+| INT64                                     |       |        |       |       |
+| INT96                                     |       |        |       |       |
+| FLOAT                                     |       |        |       |       |
+| DOUBLE                                    |       |        |       |       |
+| BYTE_ARRAY                                |       |        |       |       |
+| FIXED_LEN_BYTE_ARRAY                      |       |        |       |       |
+
+### Logical types
+
+| Data type                                 | C++   | Java   | Go    | Rust  |
+| ----------------------------------------- | ----- | ------ | ----- | ----- |
+| STRING                                    |       |        |       |       |
+| ENUM                                      |       |        |       |       |
+| UUID                                      |       |        |       |       |
+| 8 and 16 bit signed INT                   |       |        |       |       |
+| 8, 16, 32, 64 bit unsigned INT            |       |        |       |       |
+| DECIMAL (INT32)                           |       |        |       |       |
+| DECIMAL (INT64)                           |       |        |       |       |
+| DECIMAL (BYTE_ARRAY)                      |       |        |       |       |
+| DECIMAL (FIXED_LEN_BYTE_ARRAY)            |       |        |       |       |
+| DATE                                      |       |        |       |       |
+| TIME (INT32)                              |       |        |       |       |
+| TIME (INT64)                              |       |        |       |       |
+| TIMESTAMP (INT32)                         |       |        |       |       |
+| TIMESTAMP (INT64)                         |       |        |       |       |
+| INTERVAL                                  |       |        |       |       |
+| JSON                                      |       |        |       |       |
+| BSON                                      |       |        |       |       |
+| LIST                                      |       |        |       |       |
+| MAP                                       |       |        |       |       |
+| UNKNOWN                                   |       |        |       |       |
+
+### Encoding
+
+| Encoding                                  | C++   | Java   | Go    | Rust  |
+| ----------------------------------------- | ----- | ------ | ----- | ----- |
+| PLAIN                                     |       |        |       |       |
+| PLAIN_DICTIONARY                          |       |        |       |       |
+| RLE_DICTIONARY                            |       |        |       |       |
+| RLE                                       |       |        |       |       |
+| BIT_PACKED                                |       |        |       |       |
+| DELTA_BINARY_PACKED                       |       |        |       |       |
+| DELTA_LENGTH_BYTE_ARRAY                   |       |        |       |       |
+| DELTA_BYTE_ARRAY                          |       |        |       |       |
+| BYTE_STREAM_SPLIT                         |       |        |       |       |

Review Comment:
   Should this row be split into float/double and int/fixed_len_byte_array, or
should we just add notes where an implementation doesn't yet support the
expanded set of data types?



##########
content/en/docs/File Format/implementationstatus.md:
##########
@@ -0,0 +1,101 @@
+---
+title: "Implementation status"
+linkTitle: "Implementation status"
+weight: 8
+---
+### Physical types
+
+| Data type                                 | C++   | Java   | Go    | Rust  |
+| ----------------------------------------- | ----- | ------ | ----- | ----- |
+| BOOLEAN                                   |       |        |       |       |
+| INT32                                     |       |        |       |       |
+| INT64                                     |       |        |       |       |
+| INT96                                     |       |        |       |       |

Review Comment:
   Should it be noted that INT96 is deprecated?



##########
content/en/docs/File Format/implementationstatus.md:
##########
@@ -0,0 +1,101 @@
+---
+title: "Implementation status"
+linkTitle: "Implementation status"
+weight: 8
+---
+### Physical types
+
+| Data type                                 | C++   | Java   | Go    | Rust  |
+| ----------------------------------------- | ----- | ------ | ----- | ----- |
+| BOOLEAN                                   |       |        |       |       |
+| INT32                                     |       |        |       |       |
+| INT64                                     |       |        |       |       |
+| INT96                                     |       |        |       |       |
+| FLOAT                                     |       |        |       |       |
+| DOUBLE                                    |       |        |       |       |
+| BYTE_ARRAY                                |       |        |       |       |
+| FIXED_LEN_BYTE_ARRAY                      |       |        |       |       |
+
+### Logical types
+
+| Data type                                 | C++   | Java   | Go    | Rust  |
+| ----------------------------------------- | ----- | ------ | ----- | ----- |
+| STRING                                    |       |        |       |       |
+| ENUM                                      |       |        |       |       |
+| UUID                                      |       |        |       |       |
+| 8 and 16 bit signed INT                   |       |        |       |       |
+| 8, 16, 32, 64 bit unsigned INT            |       |        |       |       |
+| DECIMAL (INT32)                           |       |        |       |       |
+| DECIMAL (INT64)                           |       |        |       |       |
+| DECIMAL (BYTE_ARRAY)                      |       |        |       |       |
+| DECIMAL (FIXED_LEN_BYTE_ARRAY)            |       |        |       |       |
+| DATE                                      |       |        |       |       |
+| TIME (INT32)                              |       |        |       |       |
+| TIME (INT64)                              |       |        |       |       |
+| TIMESTAMP (INT32)                         |       |        |       |       |
+| TIMESTAMP (INT64)                         |       |        |       |       |
+| INTERVAL                                  |       |        |       |       |
+| JSON                                      |       |        |       |       |
+| BSON                                      |       |        |       |       |
+| LIST                                      |       |        |       |       |
+| MAP                                       |       |        |       |       |
+| UNKNOWN                                   |       |        |       |       |
+
+### Encoding
+
+| Encoding                                  | C++   | Java   | Go    | Rust  |
+| ----------------------------------------- | ----- | ------ | ----- | ----- |
+| PLAIN                                     |       |        |       |       |
+| PLAIN_DICTIONARY                          |       |        |       |       |
+| RLE_DICTIONARY                            |       |        |       |       |
+| RLE                                       |       |        |       |       |
+| BIT_PACKED                                |       |        |       |       |
+| DELTA_BINARY_PACKED                       |       |        |       |       |
+| DELTA_LENGTH_BYTE_ARRAY                   |       |        |       |       |
+| DELTA_BYTE_ARRAY                          |       |        |       |       |
+| BYTE_STREAM_SPLIT                         |       |        |       |       |
+
+### Compression
+
+| Compression                               | C++   | Java   | Go    | Rust  |
+| ----------------------------------------- | ----- | ------ | ----- | ----- |
+| UNCOMPRESSED                              |       |        |       |       |
+| SNAPPY                                    |       |        |       |       |
+| GZIP                                      |       |        |       |       |
+| LZO                                       |       |        |       |       |
+| BROTLI                                    |       |        |       |       |
+| LZ4                                       |       |        |       |       |
+| ZSTD                                      |       |        |       |       |
+| LZ4_RAW                                   |       |        |       |       |
+
+### Other format level features
+
+|                                           | C++   | Java   | Go    | Rust  |
+| ----------------------------------------- | ----- | ------ | ----- | ----- |
+| xxHash Bloom filters                      |       |        |       |       |
+| bloom filter length                       |       |        |       |       |
+| Statistics min_value, max_value           |       |        |       |       |
+| Column index                              |       |        |       |       |
+| Offset index                              |       |        |       |       |
+| Modular encryption                        |       |        |       |       |
+| Page CRC32 checksum                       |       |        |       |       |
+| Modular encryption                        |       |        |       |       |

Review Comment:
   Add Size Statistics (https://github.com/apache/parquet-format/pull/197)?



##########
content/en/docs/File Format/implementationstatus.md:
##########
@@ -0,0 +1,101 @@
+---
+title: "Implementation status"
+linkTitle: "Implementation status"
+weight: 8
+---
+### Physical types
+
+| Data type                                 | C++   | Java   | Go    | Rust  |
+| ----------------------------------------- | ----- | ------ | ----- | ----- |
+| BOOLEAN                                   |       |        |       |       |
+| INT32                                     |       |        |       |       |
+| INT64                                     |       |        |       |       |
+| INT96                                     |       |        |       |       |
+| FLOAT                                     |       |        |       |       |
+| DOUBLE                                    |       |        |       |       |
+| BYTE_ARRAY                                |       |        |       |       |
+| FIXED_LEN_BYTE_ARRAY                      |       |        |       |       |
+
+### Logical types
+
+| Data type                                 | C++   | Java   | Go    | Rust  |
+| ----------------------------------------- | ----- | ------ | ----- | ----- |
+| STRING                                    |       |        |       |       |
+| ENUM                                      |       |        |       |       |
+| UUID                                      |       |        |       |       |
+| 8 and 16 bit signed INT                   |       |        |       |       |
+| 8, 16, 32, 64 bit unsigned INT            |       |        |       |       |
+| DECIMAL (INT32)                           |       |        |       |       |
+| DECIMAL (INT64)                           |       |        |       |       |
+| DECIMAL (BYTE_ARRAY)                      |       |        |       |       |
+| DECIMAL (FIXED_LEN_BYTE_ARRAY)            |       |        |       |       |
+| DATE                                      |       |        |       |       |
+| TIME (INT32)                              |       |        |       |       |
+| TIME (INT64)                              |       |        |       |       |
+| TIMESTAMP (INT32)                         |       |        |       |       |
+| TIMESTAMP (INT64)                         |       |        |       |       |
+| INTERVAL                                  |       |        |       |       |
+| JSON                                      |       |        |       |       |
+| BSON                                      |       |        |       |       |
+| LIST                                      |       |        |       |       |
+| MAP                                       |       |        |       |       |
+| UNKNOWN                                   |       |        |       |       |
+
+### Encoding
+
+| Encoding                                  | C++   | Java   | Go    | Rust  |
+| ----------------------------------------- | ----- | ------ | ----- | ----- |
+| PLAIN                                     |       |        |       |       |
+| PLAIN_DICTIONARY                          |       |        |       |       |
+| RLE_DICTIONARY                            |       |        |       |       |
+| RLE                                       |       |        |       |       |
+| BIT_PACKED                                |       |        |       |       |
+| DELTA_BINARY_PACKED                       |       |        |       |       |
+| DELTA_LENGTH_BYTE_ARRAY                   |       |        |       |       |
+| DELTA_BYTE_ARRAY                          |       |        |       |       |
+| BYTE_STREAM_SPLIT                         |       |        |       |       |
+
+### Compression
+
+| Compression                               | C++   | Java   | Go    | Rust  |
+| ----------------------------------------- | ----- | ------ | ----- | ----- |
+| UNCOMPRESSED                              |       |        |       |       |
+| SNAPPY                                    |       |        |       |       |
+| GZIP                                      |       |        |       |       |
+| LZO                                       |       |        |       |       |
+| BROTLI                                    |       |        |       |       |
+| LZ4                                       |       |        |       |       |
+| ZSTD                                      |       |        |       |       |
+| LZ4_RAW                                   |       |        |       |       |
+
+### Other format level features
+
+|                                           | C++   | Java   | Go    | Rust  |
+| ----------------------------------------- | ----- | ------ | ----- | ----- |
+| xxHash Bloom filters                      |       |        |       |       |
+| bloom filter length                       |       |        |       |       |
+| Statistics min_value, max_value           |       |        |       |       |
+| Column index                              |       |        |       |       |
+| Offset index                              |       |        |       |       |

Review Comment:
   Should these two rows (Column index and Offset index) be combined as Page Indexes?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
