jorisvandenbossche commented on code in PR #38179:
URL: https://github.com/apache/arrow/pull/38179#discussion_r1524470302


##########
python/pyarrow/tests/test_util.py:
##########
@@ -207,3 +209,19 @@ def test_signal_refcycle():
         assert wr() is not None
         _break_traceback_cycle_from_frame(sys._getframe(0))
         assert wr() is None
+
+
[email protected](sys.platform != "win32",
+                    reason="Timezone database is already provided.")
+def test_download_tzdata_on_windows():
+    # Download timezone database
+    download_tzdata_on_windows()
+
+    # Inspect the folder
+    tzdata_path = os.path.expandvars(r"%USERPROFILE%\Downloads\tzdata")
+    tzdata_zones_path = os.path.join(tzdata_path, "windowsZones.xml")
+    assert os.path.exists(tzdata_path)

Review Comment:
   Can you check that this path does _not_ yet exist before calling 
`download_tzdata_on_windows` ? 
   
   Because I am wondering that on our windows builds, didn't we already 
download the data in the setup scripts to ensure we can run the tz tests? In 
that case this test might not actually be testing that it was downloaded



##########
python/pyarrow/util.py:
##########
@@ -228,3 +228,31 @@ def _break_traceback_cycle_from_frame(frame):
         # us visit the outer frame).
         refs = gc.get_referrers(frame)
     refs = frame = this_frame = None
+
+
+def download_tzdata_on_windows():
+    """
+    Download and extract latest IANA timezone database into the
+    location expected by Arrow which is %USERPROFILE%\\Downloads\tzdata.

Review Comment:
   ```suggestion
       location expected by Arrow which is %USERPROFILE%\Downloads\tzdata.
   ```



##########
python/pyarrow/tests/test_util.py:
##########
@@ -207,3 +209,19 @@ def test_signal_refcycle():
         assert wr() is not None
         _break_traceback_cycle_from_frame(sys._getframe(0))
         assert wr() is None
+
+
[email protected](sys.platform != "win32",
+                    reason="Timezone database is already provided.")
+def test_download_tzdata_on_windows():
+    # Download timezone database
+    download_tzdata_on_windows()
+
+    # Inspect the folder
+    tzdata_path = os.path.expandvars(r"%USERPROFILE%\Downloads\tzdata")
+    tzdata_zones_path = os.path.join(tzdata_path, "windowsZones.xml")
+    assert os.path.exists(tzdata_path)
+    assert os.path.exists(tzdata_zones_path)
+    # 30 files in tzdata (2023) + compressed tzdata.tar.gz + windowsZones.xml
+    assert len(os.listdir(tzdata_path)) == 32

Review Comment:
   We might not want to hardcode a number here given that we fetch the latest 
version, and so this number can then change at any time?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to