adamhooper commented on issue #686:
URL: 
https://github.com/apache/arrow-datafusion/issues/686#issuecomment-876681175


   @velvia Great points
   
   My last two suggestions were exactly about interop -- and a transition 
period. Today, DataFusion users interpret timezone=null to mean `TIMESTAMP WITH 
TIME ZONE`; but eventually, DataFusion must consider timezone=null to mean 
`TIMESTAMP WITHOUT TIME ZONE`, right? Users will need to change what they're 
doing; the transition path will be hard, and how it goes depends on the 
DataFusion community.
   
   I hadn't thought of timestamp resolution. Again, I'm out of my element (for 
now) :).
   
   The crux of my suggestion is to ignore the _value_ of the `timezone` 
metadata field and treat it as a boolean: either `timezone=null` or 
`timezone=UTC`. That translates Parquet <=> Arrow <=> PostgreSQL cleanly. 
Treating it like a boolean keeps the feature list small and saves people from 
confusion.
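   
   To make the boolean idea concrete, here is a minimal sketch in plain Python. 
It assumes the eventual semantics described above (timezone=null means 
`TIMESTAMP WITHOUT TIME ZONE`). The helper names `sql_timestamp_type` and 
`arrow_timezone_for` are hypothetical; they are not DataFusion or Arrow APIs.

```python
from typing import Optional


def sql_timestamp_type(arrow_timezone: Optional[str]) -> str:
    """Map Arrow timezone metadata to a PostgreSQL-style type name.

    Hypothetical helper: only the presence of a timezone is inspected,
    never its value.
    """
    if arrow_timezone is None:
        return "TIMESTAMP WITHOUT TIME ZONE"
    return "TIMESTAMP WITH TIME ZONE"


def arrow_timezone_for(sql_type: str) -> Optional[str]:
    """Inverse mapping: every zoned SQL timestamp becomes timezone="UTC"."""
    return "UTC" if sql_type == "TIMESTAMP WITH TIME ZONE" else None
```

   Under this sketch, a column tagged `America/Montreal` round-trips as 
`timezone=UTC`: the zone name is dropped, but the "is it zoned?" bit survives.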
   
   As for the `TO_TIMESTAMP()` parameter: Postgres has a different function 
that might do exactly what you're suggesting. [`AT TIME 
ZONE`](https://www.postgresql.org/docs/13/functions-datetime.html#FUNCTIONS-DATETIME-ZONECONVERT)
 _toggles_ the "`WITH TIME ZONE`-ness" of a timestamp. It's also callable as a 
function, `TIMEZONE(zone, timestamp)`.
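   
   Roughly, the toggle behaves like the sketch below. This is an approximation 
in plain Python, not Postgres's implementation: `at_time_zone` is a made-up 
name, naive `datetime` values stand in for `TIMESTAMP WITHOUT TIME ZONE`, and 
aware ones stand in for `TIMESTAMP WITH TIME ZONE`.

```python
from datetime import datetime, timedelta, timezone


def at_time_zone(ts: datetime, zone: timezone) -> datetime:
    """Approximate Postgres's `ts AT TIME ZONE zone` (hypothetical helper)."""
    if ts.tzinfo is None:
        # WITHOUT TIME ZONE -> WITH TIME ZONE:
        # read the wall-clock value as local time in `zone`.
        return ts.replace(tzinfo=zone)
    # WITH TIME ZONE -> WITHOUT TIME ZONE:
    # convert to `zone`, then drop the offset.
    return ts.astimezone(zone).replace(tzinfo=None)


utc_minus_5 = timezone(timedelta(hours=-5))  # fixed offset, for illustration only
naive = datetime(2021, 7, 8, 12, 0)          # no zone attached
aware = at_time_zone(naive, utc_minus_5)     # now zoned: noon at UTC-5
back = at_time_zone(aware, timezone.utc)     # naive again, in UTC wall-clock time
```

   Applying it twice with different zones is how a wall-clock time gets moved 
from one zone to another.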
   
   But -- sidetracking here -- how important are these types? `TIMESTAMP WITH 
TIME ZONE` is clearly mission-critical. I think `TIMESTAMP WITHOUT TIME ZONE` 
has its place; but it's in a different ballpark, right? (Even Spark doesn't 
have `TIMESTAMP WITHOUT TIME ZONE`; and, well, does DataFusion?) Arrow's 
timezone metadata field is even more obscure: I've only heard of it in Pandas 
and R.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


