Regarding the ordering of extension types: the default order of a type is already defined to be specific to its logical type (see `TypeDefinedOrder` in parquet.thrift). Therefore, if we make ExtensionType a logical type, then under the current semantics of the Parquet spec, extension types already come with their own order. The PR that adds ExtensionType should add a comment to `TypeDefinedOrder` stating that, for an ExtensionType, the order is defined by the type itself.

Cheers,
Jan

On Wed, 29 May 2024 at 09:10, Antoine Pitrou <anto...@python.org> wrote:

> On Wed, 29 May 2024 10:27:02 +0800
> Gang Wu <ust...@gmail.com> wrote:
> > I think adding extension type support will make it easier for adding
> > tensor or vector type, which is [1] trying to target.
> >
> > However, the geometry type seems not easy to fit to the imagination
> > of the extension type. It would be better to explicitly define geospatial
> > statistics in the spec, otherwise we have to encode them like plain-encoded
> > min/max values or even use thrift/protobuf to serialize them as binary data.
>
> Let's remember here that PLAIN encoding for numeric scalars (such as
> double or int64) is really a contiguous sequence of native
> little-endian numbers, just like e.g. the Parquet footer length.
> There's no need to explicitly invoke the PLAIN decoder, especially when
> no def/rep levels are involved.
>
> Regards
>
> Antoine.
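
To illustrate Antoine's point about PLAIN-encoded numeric scalars, here is a minimal Python sketch that reads such values with nothing more than the standard `struct` module. The helper names and the four-double bounding box are hypothetical, chosen only for illustration; they are not taken from the Parquet spec or from any proposal in this thread.

```python
import struct

def decode_plain_doubles(buf: bytes) -> tuple:
    """Read back-to-back 8-byte little-endian IEEE 754 doubles."""
    count, remainder = divmod(len(buf), 8)
    if remainder:
        raise ValueError("buffer length is not a multiple of 8 bytes")
    return struct.unpack(f"<{count}d", buf)

def decode_plain_int64s(buf: bytes) -> tuple:
    """Read back-to-back 8-byte little-endian signed integers."""
    count, remainder = divmod(len(buf), 8)
    if remainder:
        raise ValueError("buffer length is not a multiple of 8 bytes")
    return struct.unpack(f"<{count}q", buf)

# Hypothetical statistic: a bounding box stored as (xmin, ymin, xmax, ymax).
bbox = (-180.0, -90.0, 180.0, 90.0)
encoded = struct.pack("<4d", *bbox)   # 4 values * 8 bytes = 32 bytes
decoded = decode_plain_doubles(encoded)
assert decoded == bbox
```

No definition or repetition levels appear here, so the bytes can be read directly without going through a page decoder.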