Thanks - since we don’t control all the invocations of pq.write_table, I wonder if there is some configuration for the “default” behavior?
Also I wonder if there are other API surfaces that are similarly exposed to this, e.g., dataset or pd.DataFrame.to_parquet? Thanks!
Li

On Wed, Feb 21, 2024 at 3:53 PM Jacek Pliszka <jacek.plis...@gmail.com> wrote:

> Hi!
>
> pq.write_table(
>     table, config.output_filename, coerce_timestamps="us",
>     allow_truncated_timestamps=True,
> )
>
> allows you to write as us instead of ns.
>
> BR
>
> J
>
> On Wed, Feb 21, 2024 at 9:44 PM Li Jin <ice.xell...@gmail.com> wrote:
>
> > Hi,
> >
> > My colleague has informed me that during the Arrow 12->15 upgrade, he
> > found that writing a pandas DataFrame with datetime64[ns] to parquet
> > will result in nanosecond metadata and nanosecond values.
> >
> > I wonder if this is something configurable to the old behavior so we
> > can enable “nanosecond in parquet” gradually? There is code that reads
> > parquet files that doesn’t handle parquet nanoseconds yet.
> >
> > Thanks!
> > Li