[ 
https://issues.apache.org/jira/browse/ARROW-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nolo Ogbirner updated ARROW-9818:
---------------------------------
    Summary: Obscure C++ Error when Calling to_pandas on a RecordBatch  (was: 
Obscure C++ Error when Callign to_pandas on a RecordBatch)

> Obscure C++ Error when Calling to_pandas on a RecordBatch
> ---------------------------------------------------------
>
>                 Key: ARROW-9818
>                 URL: https://issues.apache.org/jira/browse/ARROW-9818
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 1.0.0
>         Environment: AWS Lambda with pyarrow 1.0.0
>            Reporter: Nolo Ogbirner
>            Priority: Critical
>
> I'm using PyArrow to stream a CSV from an input over HTTP and converting 
> each RecordBatch to a pandas DataFrame for manipulation. For testing, I'm 
> using the NYPD Motor Vehicle Collisions open source dataset. However, for 
> any file larger than the 5 MB one (e.g. 240 MB or 1 GB), my code, which runs 
> in an AWS Lambda, fails with a RuntimeError because of
> terminate called after throwing an instance of 'std::logic_error'
>  what(): basic_string::_S_construct null not valid
> after calling to_pandas() on the first batch. Why is this happening? How can 
> I fix it?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
