[GitHub] [arrow] rileyhun opened a new issue #12396: Error: Unable to read map data type

GitBox Thu, 10 Feb 2022 11:54:55 -0800


rileyhun opened a new issue #12396:
URL: https://github.com/apache/arrow/issues/12396



   We have some data stored in Parquet format, written by a `pyspark` 
pipeline, and we are trying to read it with `pyarrow`. Unfortunately, `pyarrow` 
is not able to convert one of the stored data types to pandas. We would prefer 
to read the data without relying on `pyspark`. I am using `pyarrow=7.0`.
   
   Example:
   
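   ```
   import s3fs
   import pyarrow.parquet as pq
   
   # Open the Parquet dataset written by the pyspark pipeline on S3
   fs = s3fs.S3FileSystem()
   bucket_uri = 's3://data/batch=1000doc/part=0'
   
   dataset = pq.ParquetDataset(bucket_uri, filesystem=fs)
   table = dataset.read()
   # The conversion to pandas is the step that raises the error below
   table.to_pandas()
   ```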
   
   Error:
   
   ```
   ArrowNotImplementedError: Not implemented type for Arrow list to pandas: map<string, double>
   ```
   


