bellabaiyunyu opened a new issue, #47155: URL: https://github.com/apache/arrow/issues/47155
### Describe the bug, including details regarding any error messages, version, and platform.

Hi team,

We are seeing the following error after upgrading to pyarrow 21.0.0. Downgrading to 20.0.0 resolves the issue. I believe this is a backward-incompatible change. Will the team be releasing a patch to make the latest pyarrow backward compatible?

```
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "<REDACTED_LOCAL_ENV>/build/workloads/nemo-dgxc-benchmarking-g_v25.04/NeMo/scripts/performance/llm/pretrain_deepseek_v2_lite.py", line 21, in <module>
    from nemo.collections.llm.recipes.deepseek_v2_lite import pretrain_recipe
  File "<REDACTED_LOCAL_ENV>/build/workloads/nemo-dgxc-benchmarking-g_v25.04/NeMo/nemo/collections/llm/__init__.py", line 21, in <module>
    from nemo.collections.llm.bert.data import BERTMockDataModule, BERTPreTrainingDataModule, SpecterDataModule
  File "<REDACTED_LOCAL_ENV>/build/workloads/nemo-dgxc-benchmarking-g_v25.04/NeMo/nemo/collections/llm/bert/data/__init__.py", line 3, in <module>
    from nemo.collections.llm.bert.data.specter import SpecterDataModule
  File "<REDACTED_LOCAL_ENV>/build/workloads/nemo-dgxc-benchmarking-g_v25.04/NeMo/nemo/collections/llm/bert/data/specter.py", line 18, in <module>
    from datasets import DatasetDict, load_dataset
  File "<REDACTED_LOCAL_ENV>/env/lib/python3.12/site-packages/datasets/__init__.py", line 22, in <module>
    from .arrow_dataset import Dataset
  File "<REDACTED_LOCAL_ENV>/env/lib/python3.12/site-packages/datasets/arrow_dataset.py", line 67, in <module>
    from .arrow_writer import ArrowWriter, OptimizedTypedSequence
  File "<REDACTED_LOCAL_ENV>/env/lib/python3.12/site-packages/datasets/arrow_writer.py", line 27, in <module>
    from .features import Features, Image, Value
  File "<REDACTED_LOCAL_ENV>/env/lib/python3.12/site-packages/datasets/features/__init__.py", line 18, in <module>
    from .features import Array2D, Array3D, Array4D, Array5D, ClassLabel, Features, Sequence, Value
  File "<REDACTED_LOCAL_ENV>/env/lib/python3.12/site-packages/datasets/features/features.py", line 634, in <module>
    class _ArrayXDExtensionType(pa.PyExtensionType):
                                ^^^^^^^^^^^^^^^^^^
AttributeError: module 'pyarrow' has no attribute 'PyExtensionType'. Did you mean: 'ExtensionType'?
```

### Component(s)

Python
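
To help narrow this down, a small check like the one below could confirm whether the installed pyarrow still exposes `PyExtensionType`. This is only a sketch: the helper names `missing_attributes` and `report_pyarrow` are hypothetical and are not part of pyarrow or `datasets`. It only inspects which attributes are present and does not fix the failing import in `datasets`.

```python
import importlib


def missing_attributes(module, names):
    """Return the entries of `names` that `module` does not expose."""
    return [name for name in names if not hasattr(module, name)]


def report_pyarrow():
    """Report the installed pyarrow version and which extension-type
    base classes are absent. Returns None if pyarrow is not installed."""
    try:
        pa = importlib.import_module("pyarrow")
    except ImportError:
        return None
    return {
        "version": getattr(pa, "__version__", "unknown"),
        "missing": missing_attributes(pa, ["PyExtensionType", "ExtensionType"]),
    }


if __name__ == "__main__":
    print(report_pyarrow())
```

In an environment that produces the traceback above, the expectation is that `PyExtensionType` is reported as missing while `ExtensionType` is not, which matches the hint in the `AttributeError`.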