sarutak opened a new pull request, #55744:
URL: https://github.com/apache/spark/pull/55744

   ### What changes were proposed in this pull request?
   This PR adds `follow_imports = skip` to the `[mypy-pyarrow.*]` section in 
`python/mypy.ini`.
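
   As a rough sketch of the resulting configuration: only the section name and the `follow_imports = skip` line come from this description. Any options already present in that section are omitted here rather than guessed. With `skip`, mypy does not follow imports of the matched modules and treats them as `Any`.

```ini
[mypy-pyarrow.*]
; Other options already in this section are omitted (not stated in this PR description).
; Added by this PR: do not analyze pyarrow's sources or stubs; treat the package as Any.
follow_imports = skip
```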
   
   ### Why are the changes needed?
   PyArrow 24.0.0 (released 2026-04-21) added a `py.typed` marker and a 
placeholder `__init__.pyi`, making it a PEP 561 typed package. However, 
`pyarrow.compute` has no `.pyi` stub, and its functions (`floor_temporal`, 
`assume_timezone`, `local_timestamp`, etc.) are dynamically generated at 
runtime via `_make_global_functions()`. As a result, mypy 1.19.1 reports 
`attr-defined` errors for these functions:
   
   ```
   python/pyspark/sql/pandas/types.py:546: error: Module has no attribute "floor_temporal" [attr-defined]
   python/pyspark/sql/pandas/types.py:553: error: Module has no attribute "assume_timezone" [attr-defined]
   python/pyspark/sql/conversion.py:1409: error: Module has no attribute "local_timestamp" [attr-defined]
   python/pyspark/sql/conversion.py:1439: error: Module has no attribute "local_timestamp" [attr-defined]
   ```
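
   To illustrate why a static checker cannot see such functions, here is a minimal, self-contained sketch. It is a toy module, not PyArrow's actual `_make_global_functions()` implementation, and the `ast` scan is only a simplified stand-in for static analysis, not what mypy does internally. The names `toy_compute`, `TOY_SOURCE`, and `statically_defined_names` are hypothetical.

```python
import ast
import types

# Source of a toy module whose public functions are created only at import time.
# Hypothetical stand-in; this is NOT pyarrow.compute's real code.
TOY_SOURCE = '''
def _make_global_functions():
    # Register functions by name at runtime; the source never contains
    # a literal "def floor_temporal".
    for name in ("floor_temporal", "assume_timezone", "local_timestamp"):
        globals()[name] = lambda *args, _name=name: _name

_make_global_functions()
'''


def statically_defined_names(source: str) -> set:
    """Collect names visible without running the code:
    top-level defs, classes, and simple assignments."""
    names = set()
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
    return names


# Execute the toy source inside a fresh module object, as an import would.
toy = types.ModuleType("toy_compute")
exec(TOY_SOURCE, toy.__dict__)

static_names = statically_defined_names(TOY_SOURCE)

# The attribute exists at runtime but is invisible to the static scan.
print("runtime:", hasattr(toy, "floor_temporal"))
print("static: ", "floor_temporal" in static_names)
```

   A checker that only reads the source reaches the same conclusion as the static scan here, which is the situation behind the `attr-defined` errors above.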
   
   This issue has already broken the following CI runs:
   https://github.com/apache/spark/actions/runs/25493241154/job/74815987155 (CI 
for branch-4.x)
   https://github.com/apache/spark/actions/runs/25499265653/job/74833928460 (CI 
for branch-4.x)
   
   This issue is tracked upstream in Apache Arrow as 
https://github.com/apache/arrow/issues/48970.
   
   Since the CI Dockerfile specifies `pyarrow>=23.0.0`, PyArrow 24.0.0 will be 
installed on the next image rebuild, breaking the mypy lint check.
   
   Note: The master branch CI currently uses a cached Docker image that still 
has PyArrow 23.x installed (the image was last built on 2026-03-16, before 
PyArrow 24.0.0 was released on 2026-04-21). The same error will surface on 
master once `FULL_REFRESH_DATE` is updated and the image is rebuilt.
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   Ran mypy locally with PyArrow 24.0.0 installed:
   
   ```bash
   pip install pyarrow==24.0.0
   mypy --python-executable python3 --namespace-packages --config-file python/mypy.ini python/pyspark
   ```
   
   ### Was this patch authored or co-authored using generative AI tooling?
   Kiro CLI / Opus 4.6


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

