AlenkaF commented on a change in pull request #12522:
URL: https://github.com/apache/arrow/pull/12522#discussion_r817522143



##########
File path: python/pyarrow/tests/strategies.py
##########
@@ -224,6 +231,9 @@ def _pymap(draw, key_type, value_type, size, nullable=True):
 
 @st.composite
 def arrays(draw, type, size=None, nullable=True):
+    pytest.importorskip("pytz")

Review comment:
       There are a couple of issues I am trying to solve to make this work:
   1. It seems all the tests in `test_strategies.py` are being skipped?
   2. The code I need to change to use `zoneinfo` is here:
   
https://github.com/apache/arrow/blob/69682ec944d053a60316c0af903a167eca0f24db/python/pyarrow/tests/strategies.py#L263-L274
   
   but:
   * I would need to install `timedelta`, which we do not depend on currently, to get the offset into a `timedelta` instance.
   * The current code for the offset doesn't seem to work if I play with it locally. Example:
   
   ```python
   >>> import pyarrow as pa
   >>> 
   >>> ty = pa.timestamp('s', tz='+07:30')
   >>> pa.types.is_timestamp(ty)
   True
   
   >>> offset_hours = int(ty.tz) # This errors
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   ValueError: invalid literal for int() with base 10: '+07:30'
   
   >>> import pytz
   >>> tz = pytz.timezone(ty.tz) # This also errors
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/Users/alenkafrim/repos/pyarrow-dev-9/lib/python3.9/site-packages/pytz/__init__.py", line 188, in timezone
       raise UnknownTimeZoneError(zone)
   pytz.exceptions.UnknownTimeZoneError: '+07:30'
   ```
   This works:
   ```python
   >>> ty = pa.timestamp('s', tz='America/New_York')
   >>> pa.types.is_timestamp(ty)
   True
   >>> pytz.timezone(ty.tz)
   <DstTzInfo 'America/New_York' LMT-1 day, 19:04:00 STD>
   ```
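
For reference, a fixed-offset string such as `'+07:30'` can be turned into a `tzinfo` with the standard library alone, because `timedelta` and `timezone` both live in the built-in `datetime` module. The sketch below is only an illustration: `parse_utc_offset` is a hypothetical helper, not part of pyarrow, and it assumes the offset is always written as `+HH:MM` or `-HH:MM`.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical helper (not pyarrow API): accepts only '+HH:MM' or '-HH:MM'.
_OFFSET_RE = re.compile(r"([+-])(\d{2}):(\d{2})")


def parse_utc_offset(tz_string):
    """Return a fixed-offset tzinfo for strings such as '+07:30'."""
    match = _OFFSET_RE.fullmatch(tz_string)
    if match is None:
        raise ValueError(f"not a fixed UTC offset: {tz_string!r}")
    sign, hours, minutes = match.groups()
    delta = timedelta(hours=int(hours), minutes=int(minutes))
    # Apply the sign to the whole offset so '-07:30' means minus 7.5 hours.
    return timezone(delta if sign == "+" else -delta)


tz = parse_utc_offset("+07:30")
print(datetime(2022, 3, 2, 12, 0, tzinfo=tz).isoformat())
# 2022-03-02T12:00:00+07:30
```

Named zones such as `'America/New_York'` make this helper raise `ValueError`, so a caller could catch that and fall back to a zone database lookup.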
   
   My idea for the change is below. It uses `timedelta`, and because the tests 
are being skipped I am not sure whether it works.
   ```python
     elif pa.types.is_timestamp(ty):
         min_int64 = -(2**63)
         max_int64 = 2**63 - 1
         min_datetime = datetime.datetime.fromtimestamp(min_int64 // 10**9)
         max_datetime = datetime.datetime.fromtimestamp(max_int64 // 10**9)
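         try:
             # Fixed-offset zone such as '+07:30'
             offset = ty.tz.split(":")
             offset_hours = int(offset[0])
             offset_min = int(offset[1])
             if offset[0].startswith("-"):
                 # Keep the minutes on the same side as the hours
                 offset_min = -offset_min
             # st.datetimes() expects tzinfo objects, so wrap the timedelta
             tz = datetime.timezone(datetime.timedelta(
                 hours=offset_hours, minutes=offset_min))
         except ValueError:
             # Named zone such as 'America/New_York'
             tz = ZoneInfo(ty.tz)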
         value = st.datetimes(timezones=st.just(tz), min_value=min_datetime,
                              max_value=max_datetime)
   ```
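
On the skipping question, one possible way to narrow it down, assuming the skips come from the `pytest.importorskip("pytz")` call in the diff above, is to check whether the optional modules can be found in the active environment. `is_importable` is a hypothetical helper written for this sketch.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# pytest.importorskip(name) skips when the named module cannot be imported,
# so a "missing" line here would be consistent with the tests being skipped.
for name in ("pytz", "zoneinfo"):
    status = "available" if is_importable(name) else "missing"
    print(f"{name}: {status}")
```

Running pytest with `-rs` also prints the reason for each skipped test, which should show whether the missing import is really the cause.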



