AnkitAhlawat7742 opened a new pull request, #49878: URL: https://github.com/apache/arrow/pull/49878
### Rationale for this change

When converting a `pandas.Categorical` with tz-aware datetime categories to a PyArrow array, the timezone information was silently dropped from the dictionary array's value type. This is a silent data loss bug: no warning or error is raised, but the timezone metadata is lost.

### What changes are included in this PR?

In `python/pyarrow/array.pxi`, the Categorical conversion was using `values.categories.values` (a raw NumPy array), which strips timezone metadata because NumPy does not support tz-aware datetimes. Changed to `values.categories` (a pandas Index) and added `from_pandas=True` so that PyArrow uses the pandas conversion path, which correctly preserves timezone metadata.

### Are these changes tested?

Yes. Verified manually.

### Are there any user-facing changes?

Yes, this is a bug fix. Users did #49875

This PR contains a **"Critical Fix"**: timezone information was lost silently during conversion without any warning or error.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
