I'm not a long-time Parquet user, but I assisted in the expansion of the parquet-cpp library's LogicalType facility.
My impression is that the original TIMESTAMP converted types were silent on whether the annotated value was UTC adjusted, so readers had to rely on (often arcane) out-of-band information to decide the UTC adjustment status of timestamp columns. It seemed to me that this perceived shortcoming was a primary motivator for adding the isAdjustedToUTC boolean parameter to the corresponding new Timestamp LogicalType.

If that impression is accurate, then when reading TIMESTAMP columns written by legacy (converted type only) writers, it seems inappropriate for LogicalType-aware readers to unconditionally assign *either* "false" or "true" (as currently required) to a boolean UTC-adjusted parameter, because that requires the reader to assert a property that the writer never implied.

One possible approach to untangling this might be to amend the parquet.thrift specification to change the isAdjustedToUTC boolean property to an enum or union type (some enumerated list) named (for example) UTCAdjustment with three possible values: Unknown, UTCAdjusted, NotUTCAdjusted (I'm not married to the names).

Extant files with only TIMESTAMP converted types would map, for forward compatibility, to Timestamp LogicalTypes with UTCAdjustment:=Unknown. New files with user-supplied Timestamp LogicalTypes would always record the converted type as TIMESTAMP for backward compatibility, regardless of the value of the new UTCAdjustment parameter. This would be lossy on a round trip through a legacy library, but that's unavoidable, and the legacy libraries would be no worse off than they are now.

The specification would normatively state that new user-supplied Timestamp LogicalTypes SHOULD (or MUST?) use either UTCAdjusted or NotUTCAdjusted, discouraging the use of Unknown in new files.

Thanks,
Tim
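
P.S. To make the proposed mapping concrete, here is a rough sketch in Python. It is only an illustration of the idea, not parquet-cpp code or any existing Parquet API: the class and function names (UTCAdjustment, TimestampLogicalType, from_converted_type, to_converted_type) are hypothetical, and I'm assuming the legacy converted types in question are TIMESTAMP_MILLIS and TIMESTAMP_MICROS.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class UTCAdjustment(Enum):
    """Hypothetical three-state replacement for the isAdjustedToUTC boolean."""
    UNKNOWN = 0
    UTC_ADJUSTED = 1
    NOT_UTC_ADJUSTED = 2


@dataclass(frozen=True)
class TimestampLogicalType:
    """Illustrative stand-in for the Timestamp LogicalType (not a real API)."""
    unit: str  # "MILLIS" or "MICROS" in this sketch
    utc_adjustment: UTCAdjustment


# Assumed legacy converted type names and the time unit each one implies.
_UNIT_BY_CONVERTED_TYPE = {
    "TIMESTAMP_MILLIS": "MILLIS",
    "TIMESTAMP_MICROS": "MICROS",
}
_CONVERTED_TYPE_BY_UNIT = {u: c for c, u in _UNIT_BY_CONVERTED_TYPE.items()}


def from_converted_type(converted_type: str) -> Optional[TimestampLogicalType]:
    """Forward compatibility: a legacy file says nothing about UTC adjustment,
    so the reader records Unknown instead of guessing true or false."""
    unit = _UNIT_BY_CONVERTED_TYPE.get(converted_type)
    if unit is None:
        return None  # not a legacy timestamp annotation
    return TimestampLogicalType(unit, UTCAdjustment.UNKNOWN)


def to_converted_type(logical_type: TimestampLogicalType) -> Optional[str]:
    """Backward compatibility: always emit the TIMESTAMP converted type,
    whatever the UTCAdjustment value is. The adjustment is dropped here."""
    return _CONVERTED_TYPE_BY_UNIT.get(logical_type.unit)


# Round trip through a legacy (converted type only) library is lossy.
written = TimestampLogicalType("MICROS", UTCAdjustment.NOT_UTC_ADJUSTED)
read_back = from_converted_type(to_converted_type(written))
```

After the round trip, read_back has the same unit as written but its utc_adjustment is UNKNOWN, which is the lossy (but no worse than today) behaviour described above.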
