velvia commented on issue #597:
URL: https://github.com/apache/arrow-rs/issues/597#issuecomment-885241595
@alamb there is actually a way to create primitive timestamp arrays in other
timezones. I have some code that I'll be PR'ing against both Arrow and
DataFusion which does this, along with other things we'll need. In the
meantime, something like this works:
```
let data = vec![Some(.....), Some(....)];
let array_utc =
    TimestampNanosecondArray::from_opt_vec(data, Some("UTC".to_owned()));
```
`from_vec` works as well, and I found other ways of creating timezone-based
arrays too.
You don't need to create a type which has a non-None timezone (yes, I'm aware
of that limitation and was going to point it out too). The key is that the
underlying ArrayData has the correct, actual timezone, and this is what is
returned by the data_type() calls and checked dynamically. I agree that this
leaves an inconsistency between the static type and the ArrayData, and
solving that needs more discussion.
Regarding switching to `&str`, some thoughts:
- Adding a ref/pointer will require adding lifetimes to type annotations,
which would be really annoying and a huge global change.
- I actually think that if we were to change the underlying timezone data type,
it should be a numeric offset, not a string. Strings take up a lot of space
(24 bytes on 64-bit x86 for the `String` itself, plus the heap storage for the
characters) and are slow. Furthermore, every time we use the timezone we need
to translate it to an offset, which is more likely what we care about for
computations. I'd prefer some kind of `TimeOffset`, which could simply be
`Option<i32>` or something like that. That's only 8 bytes including padding. :)
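
To make the size argument concrete, here is a rough sketch using only the standard library. `TimeOffset` and `to_local_nanos` are hypothetical names for illustration, not existing Arrow APIs, and I'm assuming the offset is stored in seconds. The sizes are what current rustc produces in practice (the layout of `String` is not formally guaranteed), so treat them as indicative:

```rust
use std::mem::size_of;

/// Hypothetical alias (not an Arrow type): UTC offset in seconds,
/// with `None` meaning "no timezone attached".
type TimeOffset = Option<i32>;

/// Shift a UTC timestamp in nanoseconds by the offset to get
/// local wall-clock nanoseconds. Assumes the offset is in seconds.
fn to_local_nanos(utc_nanos: i64, offset: TimeOffset) -> i64 {
    match offset {
        Some(secs) => utc_nanos + i64::from(secs) * 1_000_000_000,
        None => utc_nanos,
    }
}

fn main() {
    // Inline size of the string-based representation, excluding heap storage.
    println!("Option<String>: {} bytes", size_of::<Option<String>>());
    // Inline size of the numeric representation; nothing lives on the heap.
    println!("TimeOffset:     {} bytes", size_of::<TimeOffset>());

    let plus_two_hours: TimeOffset = Some(2 * 3600);
    println!("local nanos: {}", to_local_nanos(0, plus_two_hours));
}
```

The numeric form is also `Copy`, so it avoids the clone that a `String` timezone needs whenever the data type is duplicated. The obvious trade-off is that a fixed offset cannot represent DST rules the way a named timezone can.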