Thanks, everyone, for your input.

Interoperability would be the biggest issue; how much does C++ do with the 
timezone string?

-Evan
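
For anyone skimming the thread, here is a rough sketch of the "naive vs non-naive" distinction Weston describes in the quoted message below. It uses only Python's standard datetime module, not Arrow or arrow-rs. The helper name can_order is made up for illustration, and the Arrow implementations may well behave differently.

```python
from datetime import datetime, timedelta, timezone

# Naive ("local") timestamp: no timezone information attached.
naive = datetime(2021, 7, 7, 13, 33)
# Aware ("instant") timestamp: pinned to UTC.
aware = datetime(2021, 7, 7, 13, 33, tzinfo=timezone.utc)


def can_order(a, b):
    """Hypothetical helper: True if Python allows ordering a against b."""
    try:
        a < b
        return True
    except TypeError:
        return False


# Python refuses to order a naive value against an aware one.
mixed_ok = can_order(naive, aware)
same_ok = can_order(aware, aware)

# Round trip through text: the UTC offset survives serialization,
# which is the "leave it unchanged" behaviour discussed below.
restored = datetime.fromisoformat(aware.isoformat())
offset_kept = restored.utcoffset() == timedelta(0)
```

The point is only that keeping the two kinds of timestamp apart lets a library reject mixed comparisons, and that carrying the zone through unchanged avoids the "UTC goes in, no timezone comes out" problem.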

> On Jul 7, 2021, at 1:33 PM, Weston Pace <weston.p...@gmail.com> wrote:
> 
> I don't know about removal but you could probably ignore the timezone
> string and it's not clear the issues would be that significant.
> 
> If Rust never produces a non-null non-UTC timestamp then I don't see
> that as an issue.
> 
> If you are consuming data with a timezone string other than UTC it
> isn't really clear what information that timezone string is supposed
> to convey anyways.  Are you supposed to extract fields as if you were
> in that time zone?  Or does this indicate the time zone the data was
> captured in?  Postgresql, etc. do not support this concept.  Probably
> the safest thing to do would be to reject the data.
> 
> There still remains the question of whether or not you need to
> distinguish between local times and instant times.  Or, in python
> terms, naive vs non-naive.  Or, in parquet terms, whether you need to
> worry about the isAdjustedToUtc flag.  Or, in postgres terms, whether
> you need to distinguish between "timestamp with timezone" and
> "timestamp without timezone".
> 
> This boils down to whether you want to support the constraints offered
> by these semantic hints from the user or not.  For example, forbidding
> comparison between the two types of timestamps or altering how you
> display them.  If those features are not important, then Rust could
> ignore the time zone field completely.  That could cause an
> interoperability issue though (e.g. data going into rust with timezone
> UTC comes back out with no timezone even though nothing changed).
> Ideally rust could ignore the time zone string but leave it unchanged.
> 
> On Wed, Jul 7, 2021 at 6:58 AM Joris Van den Bossche
> <jorisvandenboss...@gmail.com> wrote:
>> 
>> On Wed, 7 Jul 2021 at 18:46, Jorge Cardoso Leitão <jorgecarlei...@gmail.com>
>> wrote:
>> 
>>> Hi,
>>> 
>>> AFAIK timezone is part of the spec.
>> 
>> 
>> And for reference, the current spec (Schema flatbuffer file) for timestamp
>> is at
>> https://github.com/apache/arrow/blob/6c8d30ea82222fd2750b999840872d3f6cbdc8f8/format/Schema.fbs#L217-L247.
>> 
>> 
>> 
>>> In Python, that would be [1]
>>> 
>>> import pyarrow as pa
>>> dt1 = pa.timestamp("ms", "+00:10")
>>> dt2 = pa.timestamp("ms")
>>> 
>>> arrow-rs is not very consistent in how it handles it. imo that is an
>>> artifact of it currently being difficult (API-wise) to create an array with a
>>> timezone, which has caused people not to use it much (and thus not
>>> implement kernels with it / test it properly).
>>> 
>>> I do not see how removing it would be compatible with the Arrow spec,
>>> though.
>>> 
>>> Best,
>>> Jorge
>>> 
>>> [1] https://arrow.apache.org/docs/python/generated/pyarrow.timestamp.html
>>> 
>>> 
>>> 
>>> On Wed, Jul 7, 2021 at 6:37 PM Evan Chan <e...@urbanlogiq.com> wrote:
>>> 
>>>> Hi folks,
>>>> 
>>>> Some of us are having a discussion about a direction change for Rust Arrow
>>>> timestamp types, which currently support both a resolution field (Ns, Micros,
>>>> Ms, Seconds) similar to the other language implementations, but also
>>>> optionally a timezone string field.   I believe the timezone field is
>>>> unique to the Rust implementation, as I don’t find it in the C/C++ and
>>>> Python docs.   At the same time, in reality if the timezone field is non
>>>> null, this is not well supported at all in the current code.  Functions
>>>> returning timestamps pretty much all return a null timezone, for example,
>>>> and don’t allow the timezone to be specified.
>>>> 
>>>> The proposal would be to eliminate the timezone field and bring the Rust
>>>> Arrow timestamp type in line with that of the other language
>>>> implementations, also simplifying implementation.   It seems this is in
>>>> line with direction of other projects (Parquet, Spark, and most DBs have
>>>> timestamp types which do not have explicit timezones or are implicitly UTC).
>>>> 
>>>> Please feel free to see
>>>> https://github.com/apache/arrow-datafusion/issues/686
>>>> (Or would it be better to discuss here on the mailing list?)
>>>> 
>>>> Cheers!
>>>> Evan
>>> 
