wgtmac commented on code in PR #240:
URL: https://github.com/apache/parquet-format/pull/240#discussion_r1600961998


##########
src/main/thrift/parquet.thrift:
##########
@@ -270,8 +270,11 @@ struct Statistics {
     * may set min_value="B", max_value="C". Such more compact values must still be
     * valid values within the column's logical type.
     *
-    * Values are encoded using PLAIN encoding, except that variable-length byte
-    * arrays do not include a length prefix.
+    * Values are encoded using PLAIN encoding, except that:
+    * 1) variable-length byte arrays do not include a length prefix.
+    * 2) geometry logical type with BoundingBoxOrder uses max_value/min_value pair

Review Comment:
   Yes, option 1 is more efficient, but option 2 might be easier for different
Parquet implementations. parquet-mr (the Java implementation of Parquet) will
depend on JTS, so it is natural for it to accept JTS Geometry as input data.
However, for other Parquet implementations (e.g. parquet-cpp from Arrow C++, or
the Rust parquet implementation from arrow-rs), perhaps we need to leverage
GeoArrow?
   
   cc @pitrou @mapleFU @tustvold @zeroshade @etseidl for awareness.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

