paleolimbot commented on code in PR #9997:
URL: https://github.com/apache/arrow-rs/pull/9997#discussion_r3277385353


##########
parquet/src/file/metadata/thrift/mod.rs:
##########
@@ -621,6 +621,26 @@ fn read_column_chunk<'a>(
     Ok(col)
 }
 
+fn read_schema(prot: &mut ThriftSliceInputProtocol) -> 
Result<SchemaDescriptor> {
+    let mut schema = read_thrift_vec::<SchemaElement, 
ThriftSliceInputProtocol>(&mut *prot)?;
+    // An earlier version of this crate enforced this when decoding 
LogicalType. Now that
+    // the decoder is macro generated, we do this to preserve the original 
behavior.
+    // TODO: this was done due to a line in the spec saying an unset algorithm 
defaults
+    // to SPHERICAL. But there is no default in the Thrift, so it would be 
better to set the
+    // default when consumed.
+    // See 
https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#geography
+    for se in schema.iter_mut() {
+        match se.logical_type.as_mut() {
+            Some(LogicalType::Geography(g)) if g.algorithm.is_none() => {
+                g.algorithm = Some(Default::default());
+            }

Review Comment:
   You can do whatever is best for you here...the extension type translation 
and me ( 
https://github.com/apache/sedona-db/blob/c2613699056b67c001a7ef6899e8568f91bc6596/rust/sedona-geoparquet/src/metadata.rs#L657-L674
 ) are probably the only two actual users of this. That said, it *is* probably 
less confusing to always apply the Spherical default (most people consuming 
this API are non-spatial and have no idea what any of this means).
   
   A heads up that there's a bug in that extension type (I will fix but haven't 
gotten there yet): https://github.com/apache/arrow-rs/issues/9929



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to