The following code dies with pyarrow 0.14.2:

    import pyarrow as pa
    import pyarrow.parquet as pq

    schema = pa.schema([('timestamp', pa.timestamp('ns', tz='UTC')),])
    writer = pq.ParquetWriter('foo.parquet', schema, coerce_timestamps='ns')

    ts_array = pa.array([ int(1234567893141) ], type=pa.timestamp('ns', tz='UTC'))
    table = pa.Table.from_arrays([ ts_array ], names=['timestamp'])

    writer.write_table(table)
    writer.close()

with the message:

    ValueError: Invalid value for coerce_timestamps: ns

That appears to be because of this code in _parquet.pxi:

    cdef int _set_coerce_timestamps(
            self, ArrowWriterProperties.Builder* props) except -1:
        if self.coerce_timestamps == 'ms':
            props.coerce_timestamps(TimeUnit_MILLI)
        elif self.coerce_timestamps == 'us':
            props.coerce_timestamps(TimeUnit_MICRO)
        elif self.coerce_timestamps is not None:
            raise ValueError('Invalid value for coerce_timestamps: {0}'
                             .format(self.coerce_timestamps))

which restricts the choice to 'ms' or 'us', even though AFAICT everywhere else also allows 'ns' (and there is a TimeUnit_NANO defined). Is this intentional, or a bug?

Thanks,

- db
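
For reference, the dispatch asked about above can be sketched in plain Python with an 'ns' branch added. This is only an illustration, not the actual pyarrow code: the function name resolve_coerce_timestamps and the string stand-ins for the TimeUnit_* constants are hypothetical, and it assumes that mapping 'ns' to TimeUnit_NANO is what a fix would look like. Whether the Parquet writer can then actually store nanoseconds is a separate question that this sketch does not address.

```python
# Hypothetical stand-ins for the C++ TimeUnit_* enum values. Only the names
# TimeUnit_MILLI, TimeUnit_MICRO and TimeUnit_NANO come from the post above.
_UNITS = {
    'ms': 'TimeUnit_MILLI',
    'us': 'TimeUnit_MICRO',
    'ns': 'TimeUnit_NANO',  # assumption: the branch that appears to be missing
}


def resolve_coerce_timestamps(unit):
    """Return the unit name for `unit`, or None when no coercion is requested."""
    if unit is None:
        return None
    try:
        return _UNITS[unit]
    except KeyError:
        # Same message format as the ValueError quoted in the report.
        raise ValueError(
            'Invalid value for coerce_timestamps: {0}'.format(unit))
```

With this mapping, 'ms', 'us' and 'ns' are all accepted, None means "do not coerce", and anything else still raises the same ValueError as before.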