It is a validation bug that you can write values to the column and read them back. My understanding is that the use case for the type comes from more loosely typed systems that infer schemas on the fly and then write to Parquet. In these systems, if a column contains only null values, the actual type cannot be inferred, so the Null logical type would in theory allow for schema evolution later, if and when the actual type is discovered while writing more data to different files. I believe one example of where this is used is when writing out Pandas dataframes to Parquet.
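To make the schema-evolution idea concrete, here is a rough pure-Python sketch. It does not use any Parquet library; the helpers `infer_type`, `merge_types`, and `merge_schemas`, and the `"null"` placeholder, are hypothetical names made up for illustration. The assumption is only that an all-null column gets a placeholder type that any concrete type discovered later may replace.

```python
from typing import Dict, List

NULL = "null"  # hypothetical placeholder type for a column that holds only None


def infer_type(values: List[object]) -> str:
    """Infer a column type from its non-null values; all-null columns get NULL."""
    kinds = {type(v).__name__ for v in values if v is not None}
    if not kinds:
        return NULL
    if len(kinds) > 1:
        raise TypeError(f"mixed types in column: {sorted(kinds)}")
    return kinds.pop()


def merge_types(old: str, new: str) -> str:
    """The placeholder yields to any concrete type; concrete types must agree."""
    if old == NULL:
        return new
    if new == NULL or new == old:
        return old
    raise TypeError(f"incompatible types: {old} vs {new}")


def merge_schemas(a: Dict[str, str], b: Dict[str, str]) -> Dict[str, str]:
    """Combine per-file schemas, treating a missing column like an all-null one."""
    merged = dict(a)
    for name, typ in b.items():
        merged[name] = merge_types(merged.get(name, NULL), typ)
    return merged


# The first "file" has no non-null scores, so the column type is unknown.
file1 = {"id": [1, 2], "score": [None, None]}
# A later "file" reveals the actual type.
file2 = {"id": [3], "score": [0.5]}

schema1 = {name: infer_type(col) for name, col in file1.items()}
schema2 = {name: infer_type(col) for name, col in file2.items()}
merged = merge_schemas(schema1, schema2)

print(schema1)  # {'id': 'int', 'score': 'null'}
print(merged)   # {'id': 'int', 'score': 'float'}
```

In this sketch the placeholder never carries values, which is why being able to store actual values in a Null-annotated column looks like a missing validation.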
Cheers,
Micah

On Tue, Feb 28, 2023 at 4:07 PM Jerry Adair <[email protected]> wrote:

> Hi,
>
> I am just learning of the Parquet Null logical type. I've read the
> documentation, as well as the brief inline commentary in the types header.
> That states that the Null logical type can annotate any primitive type.
> What I find confusing is that if I create a Parquet table with a primitive
> type, say Int32 for example, and then assign it the Null logical type, I
> can still write and then read values from that column. This leads me to a
> more general question: what is the typical use case scenario for a Null
> logical type? And how is it supposed to work and intended to be used?
>
> Thanks!
>
