I think I'd be OK with this, though I'm not sure we've defined when we want log messages and when we don't. If you think this is useful, please open a JIRA so others can chime in.
FWIW, I think in the next release or two we will likely be switching the default to 2.4 or 2.6.

-Micah

On Fri, Jan 28, 2022 at 9:31 AM Grant Williams <[email protected]> wrote:

> Thank you, Micah! That makes sense.
>
> Do you have any thoughts about maybe adding a logged warning if a user
> calls write_table() and uint32() is in the given schema?
>
> On Fri, Jan 28, 2022 at 11:15 AM Micah Kornfield <[email protected]>
> wrote:
>
>> Hi Grant,
>> This is intended behavior because the default writing of parquet uses
>> version 1 of logical types. Version 1 does not support annotating fields as
>> uint32, so to preserve the values round trip they are cast to int64. If
>> you wish to maintain the type setting the version kwarg to 2.4 or 2.6 [1]
>> should work.
>>
>> Cheers,
>> Micah
>>
>>
>> [1]
>> https://arrow.apache.org/docs/python/generated/pyarrow.parquet.write_table.html
>>
>> On Fri, Jan 28, 2022 at 9:04 AM Grant Williams <[email protected]>
>> wrote:
>>
>>> Hello,
>>>
>>> I've found that if you write a file that has a schema that specifies
>>> column A as a uint32() type. If you read the file and inspect the schema it
>>> will show Column A as int64(). This issue appears to be unique to the
>>> uint32() type and I was unable to get any other type mismatches with the
>>> other integer or float types.
>>>
>>> The following is a link to a gist showing a minimal code example and the
>>> output from it:
>>> https://gist.github.com/grantmwilliams/1ceb490312c59e4fb6e4bc15b57e9707.
>>>
>>> I'm not sure if this is a problem with the physical datatype being
>>> actually written as int64, or if the metadata for the file is just wrong
>>> instead. Does anyone have any idea what could be causing this? Or whether
>>> it's just a metadata issue or an actual physical type error?
>>>
>>> Thanks,
>>> Grant W.
>>>
>>> --
>>> Grant Williams
>>> Machine Learning Engineer
>>> https://github.com/grantmwilliams/
>>>
>
> --
> Grant Williams
> Machine Learning Engineer
> https://github.com/grantmwilliams/
>
