rusackas commented on PR #39756:
URL: https://github.com/apache/superset/pull/39756#issuecomment-4654976267

   Pushed this forward — rebased on latest `master` and verified it's landable. 
Summary of where it stands:
   
   **Changelog review (pyarrow 21 → 24):** The only change that actually 
touches how Superset uses pyarrow is UUID type inference. Since **pyarrow 21**, 
`pa.array([uuid.UUID(...)])` infers the canonical `arrow.uuid` extension type 
(16-byte fixed binary) instead of raising. Previously `SupersetResultSet` 
relied on that raise to route UUID values through its stringification fallback, 
so without explicit handling, UUID values would surface in the results grid as 
garbled bytes. 
The other notable items (gandiva deprecation, removal of long-deprecated 
v13/v18 APIs) don't affect any code path we use. No `Table.to_pandas` signature 
changes (`integer_object_nulls`, `timestamp_as_object` still work), and 
`pyarrow.parquet` / `pyarrow.feather` / `pyarrow.lib.ArrowException` all still 
import.
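
To illustrate why the raw storage form reads as garbage in a grid, here is a minimal stdlib-only sketch comparing a UUID's 16-byte form with its canonical string. The sample UUID value is arbitrary and the variable names are illustrative, not taken from the PR:

```python
import uuid

# Arbitrary sample value, chosen only for illustration.
value = uuid.UUID("12345678-1234-5678-1234-567812345678")

raw = value.bytes        # the 16-byte form that a fixed-size binary column stores
canonical = str(value)   # the hyphenated hex form a user expects to see

# Rebuilding the UUID from its bytes recovers the canonical string.
restored = str(uuid.UUID(bytes=raw))

print(len(raw), canonical, restored == canonical)
```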
   
   **Breakage found + fixed:** The UUID extension-type regression, in two 
places:
   - `SupersetResultSet` (SQL Lab, Explore/chart data, column introspection)
   - `superset/semantic_layers/mapper.py`, which round-trips through 
`Table.to_pandas()`
   
   Added a shared `stringify_extension_columns(table)` helper that converts any 
Arrow extension column to its string form (UUIDs → canonical hex) and applied 
it at both sites. Plain binary/BLOB columns aren't extension types, so they're 
untouched. Regression tests cover the helper plus an end-to-end UUID result set.
   
   **Dependency floor:** No hard conflict. pyarrow 24 installs cleanly against 
our pinned `pandas==2.1.4` / `numpy 1.26.4`, and `pyproject.toml` already 
declares `pyarrow>=24.0.0,<25` alongside `pandas>=2.1.4,<2.4`. Also added 
`apache-2.0` to the liccheck authorized-licenses list (a transitive dep reports 
that exact license string).
   
   **Tests (run locally with pyarrow 24.0.0 + pandas 2.1.4):**
   - `result_set` / `dataframe` / `arrow` / `semantic` unit tests: 471 passed, 
1 skipped
   - `columnar` / `uploader` / `hive` (parquet paths): 80 passed, 1 skipped
   - `pre-commit` (mypy, ruff, pylint) clean on all changed files
   
   Landable as-is. Not merging — leaving that to a committer.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

