zeroshade commented on issue #2843: URL: https://github.com/apache/arrow-adbc/issues/2843#issuecomment-2907209749
Ah, so the most likely issue here is the way we do ingestion for Snowflake. Snowflake doesn't provide a native Arrow ingestion method, so `adbc_ingest` writes out Parquet files and then uses `COPY INTO` on the Snowflake side to copy the data from the uploaded Parquet files into the Snowflake table. In the best-case scenario, we'd be ingesting Parquet files whose timestamp column has the `isAdjustedToUTC` flag set to true in its logical type.

Now, according to the Snowflake docs:

> TIMESTAMP_TZ internally stores UTC time together with an associated time zone offset. When a time zone isn’t provided, the session time zone offset is used.

So, assuming the uploaded Parquet file has the proper logical type, it would come down to how Snowflake handles the `COPY INTO` of the Parquet file into the column, since Parquet doesn't have a per-row time zone either.

Essentially one of the following is happening:

1. Polars generates the appropriate Arrow type with `Timestamp[ns, UTC]`, we generate Parquet files with the type `Timestamp[isAdjustedToUTC=true,unit=NANOS]` for ingestion, and Snowflake doesn't respect the logical type. As a result, it assumes the values don't have an associated time zone offset and assigns them the session time zone (the account time zone is the default for the session).
2. Polars generates Arrow data with an empty timezone instead of "UTC", resulting in the Parquet files being written with `isAdjustedToUTC=false`. Snowflake then acts accordingly by assuming the values are in the session/account time zone.

In scenario 1, I would argue the issue is on Snowflake's side and they would have to address it. In scenario 2, Polars should be fixed so that it generates the type using the "UTC" timezone explicitly, and then it would still remain to be seen whether Snowflake would respect that or not.
In either case, another option could be to simply set the session time zone to UTC before running the ingest, which would at least make Snowflake apply the correct time zone in this case. It isn't a good general solution, though.

@lidavidm @CurtHagenlocher I'm curious what your thoughts are on the above.
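For reference, the session time zone workaround could look roughly like the sketch below. It assumes an ADBC DBAPI cursor (which provides `execute` and `adbc_ingest`); the helper name `ingest_with_utc_session` is hypothetical, and this has not been verified against the reported case.

```python
def ingest_with_utc_session(cursor, table_name, data, mode="append"):
    """Hypothetical helper: pin the Snowflake session to UTC, then ingest.

    `cursor` is expected to be an ADBC DBAPI cursor, i.e. an object with
    `execute(sql)` and `adbc_ingest(table_name, data, mode=...)`.
    """
    # TIMEZONE is a Snowflake session parameter, so this only affects the
    # current session, not the account default.
    cursor.execute("ALTER SESSION SET TIMEZONE = 'UTC'")
    return cursor.adbc_ingest(table_name, data, mode=mode)


# Usage sketch (needs a real Snowflake connection, so it is left commented):
#
#   import adbc_driver_snowflake.dbapi
#   with adbc_driver_snowflake.dbapi.connect(uri) as conn:
#       with conn.cursor() as cur:
#           ingest_with_utc_session(cur, "my_table", arrow_table)
#       conn.commit()
```

Since the setting is per session, it has to be issued on the same connection that performs the ingest.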