This should already be possible, at least on git master and perhaps also in 2.0.0. What problem are you encountering?
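For instance, with the Python bindings something along these lines might work: pass an explicit schema that uses the large types, and the scanner should cast the columns on read. (This is just a sketch; the file path and column names below are placeholders, not anything from your setup.)

import pyarrow as pa
import pyarrow.dataset as ds

# Explicit schema that promotes variable-width columns to
# 64-bit-offset types (column names are placeholders).
schema = pa.schema([
    pa.field("payload", pa.large_binary()),      # instead of binary()
    pa.field("tags", pa.large_list(pa.utf8())),  # instead of list_(utf8())
])

# Passing a schema asks the scanner to cast the Parquet
# columns to the given Arrow types when materializing.
dataset = ds.dataset("data.parquet", format="parquet", schema=schema)
table = dataset.to_table()
print(table.schema)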
On 09/01/2021 at 05:27, Steve Kim wrote:
> Is it possible to read Parquet columns into an Arrow schema that has
> variable-width types with 64-bit offsets (LargeBinary, LargeList, etc.)?
>
> For my current use case, I prefer the large types because the data overflow
> 32-bit offsets, and it is easier to waste memory with 8 bytes per offset
> than it is to work with chunked arrays. (I need to access the Arrow buffers
> from Java, and the Java library does not yet provide a convenient
> abstraction for chunked arrays.)
>
> I would like an option to use large types when reading Parquet files with
> the Dataset API. My feature request could be satisfied more generally by
> enabling users to specify type coercion/promotion when mapping Parquet
> types to Arrow types.
>
> Are other users interested in this feature? Is anyone opposed?
>
> Steve Kim