rileyhun opened a new issue #12396:
URL: https://github.com/apache/arrow/issues/12396


   We have data stored in Parquet format by a `pyspark` pipeline and are 
trying to read it with `pyarrow`. Unfortunately, `pyarrow` is not able to 
interpret one of the stored data types (see the error below). We would prefer 
to read the data without relying on `pyspark`. I am using `pyarrow=7.0`.
   
   Example:
   
   ```
   import s3fs
   import pyarrow.parquet as pq
   
   fs = s3fs.S3FileSystem()
   bucket_uri = 's3://data/batch=1000doc/part=0'
   
   dataset = pq.ParquetDataset(bucket_uri, filesystem=fs)
   table = dataset.read()
   table.to_pandas()
   ```
   
   Error:
   
   ```
   ArrowNotImplementedError: Not implemented type for Arrow list to pandas: map<string, double>
   ```
   

