bellabaiyunyu opened a new issue, #47155:
URL: https://github.com/apache/arrow/issues/47155

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   Hi team, 
   
   We are seeing the following error after upgrading to pyarrow 21.0.0. Downgrading to 20.0.0 resolves the issue.
   
   I believe this is a backward-incompatible change. Will the team be releasing a patch to make the latest pyarrow backward compatible?
   ```
   Traceback (most recent call last):
     File "<frozen runpy>", line 198, in _run_module_as_main
     File "<frozen runpy>", line 88, in _run_code
     File "<REDACTED_LOCAL_ENV>/build/workloads/nemo-dgxc-benchmarking-g_v25.04/NeMo/scripts/performance/llm/pretrain_deepseek_v2_lite.py", line 21, in <module>
       from nemo.collections.llm.recipes.deepseek_v2_lite import pretrain_recipe
     File "<REDACTED_LOCAL_ENV>/build/workloads/nemo-dgxc-benchmarking-g_v25.04/NeMo/nemo/collections/llm/__init__.py", line 21, in <module>
       from nemo.collections.llm.bert.data import BERTMockDataModule, BERTPreTrainingDataModule, SpecterDataModule
     File "<REDACTED_LOCAL_ENV>/build/workloads/nemo-dgxc-benchmarking-g_v25.04/NeMo/nemo/collections/llm/bert/data/__init__.py", line 3, in <module>
       from nemo.collections.llm.bert.data.specter import SpecterDataModule
     File "<REDACTED_LOCAL_ENV>/build/workloads/nemo-dgxc-benchmarking-g_v25.04/NeMo/nemo/collections/llm/bert/data/specter.py", line 18, in <module>
       from datasets import DatasetDict, load_dataset
     File "<REDACTED_LOCAL_ENV>/env/lib/python3.12/site-packages/datasets/__init__.py", line 22, in <module>
       from .arrow_dataset import Dataset
     File "<REDACTED_LOCAL_ENV>/env/lib/python3.12/site-packages/datasets/arrow_dataset.py", line 67, in <module>
       from .arrow_writer import ArrowWriter, OptimizedTypedSequence
     File "<REDACTED_LOCAL_ENV>/env/lib/python3.12/site-packages/datasets/arrow_writer.py", line 27, in <module>
       from .features import Features, Image, Value
     File "<REDACTED_LOCAL_ENV>/env/lib/python3.12/site-packages/datasets/features/__init__.py", line 18, in <module>
       from .features import Array2D, Array3D, Array4D, Array5D, ClassLabel, Features, Sequence, Value
     File "<REDACTED_LOCAL_ENV>/env/lib/python3.12/site-packages/datasets/features/features.py", line 634, in <module>
       class _ArrayXDExtensionType(pa.PyExtensionType):
                                   ^^^^^^^^^^^^^^^^^^
   AttributeError: module 'pyarrow' has no attribute 'PyExtensionType'. Did you mean: 'ExtensionType'?
   ```
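
The failure can likely be confirmed without NeMo or `datasets` by checking which extension-type base classes the installed pyarrow exposes. The snippet below is a minimal sketch and not part of the original report: `describe_extension_api` is a hypothetical helper name, and it assumes only that `pyarrow` exposes `__version__` and that the missing attribute is the `PyExtensionType` named in the traceback.

```python
import importlib.util


def describe_extension_api(module):
    """Report which extension-type base classes a pyarrow-like module exposes.

    `describe_extension_api` is a hypothetical helper for illustration only.
    """
    return {
        "version": getattr(module, "__version__", "unknown"),
        "has_PyExtensionType": hasattr(module, "PyExtensionType"),
        "has_ExtensionType": hasattr(module, "ExtensionType"),
    }


# Only inspect pyarrow if it is installed in the current environment.
if importlib.util.find_spec("pyarrow") is not None:
    import pyarrow as pa

    print(describe_extension_api(pa))
```

Given the traceback above, `has_PyExtensionType` would be expected to be `False` on 21.0.0 and `True` after downgrading to 20.0.0. Running the check in both environments should show whether the import error comes from pyarrow itself rather than from the calling code.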
   
   ### Component(s)
   
   Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
