alamb commented on code in PR #123:
URL: https://github.com/apache/parquet-site/pull/123#discussion_r2457311328


##########
content/en/docs/File Format/implementationstatus.md:
##########
@@ -43,30 +49,43 @@ Implementations:
 
 ### Logical types
 
-| Data type                                 | arrow | parquet-java  | arrow-go | arrow-rs | cudf  | hyparquet | duckdb |
-| ----------------------------------------- | ----- | ------------- | -------- | -------- | ----- | --------- | ------ |
-| STRING                                    |  ✅   |  ✅           |  ✅      |  ✅      |  ✅   |  ✅       |   ✅   |
-| ENUM                                      |  ❌   |  ✅           |  ✅      |  ✅ (1)  |  ❌   |  ✅       |   ✅   |
-| UUID                                      |  ❌   |  ✅           |  ✅      |  ✅ (1)  |  ❌   |  ✅       |   ✅   |
-| 8, 16, 32, 64 bit signed and unsigned INT |  ✅   |  ✅           |  ✅      |  ✅      |  ✅   |  ✅       |   ✅   |
-| DECIMAL (INT32)                           |  ✅   |  ✅           |  ✅      |  ✅      |  ✅   |  ✅       |   ✅   |
-| DECIMAL (INT64)                           |  ✅   |  ✅           |  ✅      |  ✅      |  ✅   |  ✅       |   ✅   |
-| DECIMAL (BYTE_ARRAY)                      |  ✅   |  ✅           |  ✅      |  ✅      |  ✅   |  ✅       |   (R)  |
-| DECIMAL (FIXED_LEN_BYTE_ARRAY)            |  ✅   |  ✅           |  ✅      |  ✅      |  ✅   |  ✅       |   ✅   |
-| DATE                                      |  ✅   |  ✅           |  ✅      |  ✅      |  ✅   |  ✅       |   ✅   |
-| TIME (INT32)                              |  ✅   |  ✅           |  ✅      |  ✅      |  ✅   |  ✅       |   ✅   |
-| TIME (INT64)                              |  ✅   |  ✅           |  ✅      |  ✅      |  ✅   |  ✅       |   ✅   |
-| TIMESTAMP (INT64)                         |  ✅   |  ✅           |  ✅      |  ✅      |  ✅   |  ✅       |   ✅   |
-| INTERVAL                                  |  ✅   |  ✅ (1)       |  ✅      |  ✅      |  ❌   |  ✅       |   ✅   |
-| JSON                                      |  ✅   |  ✅ (1)       |  ✅      |  ✅ (1)  |  ❌   |  ✅       |   ✅   |
-| BSON                                      |  ❌   |  ✅ (1)       |  ✅      |  ✅ (1)  |  ❌   |  ❌       |   ❌   |
-| LIST                                      |  ✅   |  ✅           |  ✅      |  ✅      |  ✅   |  (R)      |   ✅   |
-| MAP                                       |  ✅   |  ✅           |  ✅      |  ✅      |  ✅   |  (R)      |   ✅   |
-| UNKNOWN (always null)                     |  ✅   |  ✅           |  ✅      |  ✅      |  ✅   |  ✅       |   ✅   |
-| FLOAT16                                   |  ✅   |  ✅ (1)       |  ✅      |  ✅      |  ✅   |  ✅       |   ✅   |
+Logical types are defined by the [`union LogicalType` in parquet.thrift] and described in [LogicalTypes.md]
+
+[`union LogicalType` in parquet.thrift]: https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L471
+[LogicalTypes.md]: https://github.com/apache/parquet-format/blob/master/LogicalTypes.md
+
+| Data type                               | arrow | parquet-java | arrow-go | arrow-rs | cudf | hyparquet | duckdb |
+|-----------------------------------------|------| ------- | ------- | ------- | ---- | -------- |--------|
+| STRING                                  | ✅    |  ✅     |  ✅     |  ✅     |  ✅  |  ✅      | ✅      |
+| ENUM                                    | ❌    |  ✅     |  ✅     |  ✅ (1) |  ❌  |  ✅      | ✅      |
+| UUID                                    | ❌    |  ✅     |  ✅     |  ✅ (1) |  ❌  |  ✅      | ✅      |
+| 8, 16, 32, 64 bit signed and unsigned INT | ✅    |  ✅     |  ✅     |  ✅     |  ✅  |  ✅      | ✅      |
+| DECIMAL (INT32)                         | ✅    |  ✅     |  ✅     |  ✅     |  ✅  |  ✅      | ✅      |
+| DECIMAL (INT64)                         | ✅    |  ✅     |  ✅     |  ✅     |  ✅  |  ✅      | ✅      |
+| DECIMAL (BYTE_ARRAY)                    | ✅    |  ✅     |  ✅     |  ✅     |  ✅  |  ✅      | (R)    |
+| DECIMAL (FIXED_LEN_BYTE_ARRAY)          | ✅    |  ✅     |  ✅     |  ✅     |  ✅  |  ✅      | ✅      |
+| FLOAT16                                 | ✅    |  ✅ (1) |  ✅     |  ✅     |  ✅  |  ✅      | ✅      |
+| DATE                                    | ✅    |  ✅     |  ✅     |  ✅     |  ✅  |  ✅      | ✅      |
+| TIME (INT32)                            | ✅    |  ✅     |  ✅     |  ✅     |  ✅  |  ✅      | ✅      |
+| TIME (INT64)                            | ✅    |  ✅     |  ✅     |  ✅     |  ✅  |  ✅      | ✅      |
+| TIMESTAMP (INT64)                       | ✅    |  ✅     |  ✅     |  ✅     |  ✅  |  ✅      | ✅      |
+| INTERVAL                                | ✅    |  ✅ (1) |  ✅     |  ✅     |  ❌  |  ✅      | ✅      |
+| JSON                                    | ✅    |  ✅ (1) |  ✅     |  ✅ (1) |  ❌  |  ✅      | ✅      |
+| BSON                                    | ❌    |  ✅ (1) |  ✅     |  ✅ (1) |  ❌  |  ❌      | ❌      |
+| [VARIANT]                               |      |        |        |        |     |         |        |
+| [GEOMETRY]                              |      |        |        |        |     |         |        |
+| [GEOGRAPHY]                             |      |        |        |        |     |         |        |

Review Comment:
   Indeed it seems to be better after `load spatial`
   
   ```sql
   D LOAD spatial;
   D from 'geospatial.parquet';
   IO Error:
   No files found that match the pattern "geospatial.parquet"
   D from 'geospatial/geospatial.parquet';
   
   ┌──────────────────────┬──────────────────────┬────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
   │        group         │         wkt          │                                                      geometry                                                      │
   │       varchar        │       varchar        │                                                      geometry                                                      │
   ├──────────────────────┼──────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
   │ all                  │ POINT (30 10)        │ POINT (30 10)                                                                                                      │
   │ all                  │ LINESTRING (30 10,…  │ LINESTRING (30 10, 10 30, 40 40)                                                                                   │
   │ all                  │ POLYGON ((30 10, 4…  │ POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))                                                                      │
   │ all                  │ MULTIPOINT ((30 10)) │ MULTIPOINT (30 10)                                                                                                 │
   │ all                  │ MULTILINESTRING ((…  │ MULTILINESTRING ((30 10, 10 30, 40 40))                                                                            │
   │ all                  │ MULTIPOLYGON (((30…  │ MULTIPOLYGON (((30 10, 40 40, 20 40, 10 20, 30 10)))                                                               │
   │ all                  │ GEOMETRYCOLLECTION…  │ GEOMETRYCOLLECTION (POINT (30 10), LINESTRING (30 10, 10 30, 40 40), POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10…  │
   │ all                  │ POINT Z (30 10 40)   │ POINT Z (30 10 40)                                                                                                 │
   │ all                  │ LINESTRING Z (30 1…  │ LINESTRING Z (30 10 40, 10 30 40, 40 40 80)                                                                        │
   │ all                  │ POLYGON Z ((30 10 …  │ POLYGON Z ((30 10 40, 40 40 80, 20 40 60, 10 20 30, 30 10 40))                                                     │
   │ multilinestring-zm   │ MULTILINESTRING ZM…  │ MULTILINESTRING ZM EMPTY                                                                                           │
   ...
   ├──────────────────────┴──────────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
   │ 196 rows (40 shown)                                                                                                                                    3 columns │
   └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
   ```
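
   One way to make the comparison concrete (a sketch only, not something run in this thread) is to ask DuckDB how it types the `geometry` column before and after the extension is loaded. `DESCRIBE`, `INSTALL`, and `LOAD` are standard DuckDB statements and the file path is the one used in the session above; the first statement assumes a fresh session where `spatial` has not been loaded yet, and no particular output is assumed here.

   ```sql
   -- Column types as reported before the spatial extension is loaded
   -- (assumes a fresh DuckDB session)
   DESCRIBE SELECT * FROM 'geospatial/geospatial.parquet';

   -- Install (if not already installed) and load the extension, then compare
   INSTALL spatial;
   LOAD spatial;
   DESCRIBE SELECT * FROM 'geospatial/geospatial.parquet';
   ```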



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
