emkornfield commented on issue #14229: URL: https://github.com/apache/arrow/issues/14229#issuecomment-1282747916
> I don't fully understand how the dataframe is involved here. If I read the above correctly, it is the reading of a Parquet file into an Arrow table that is failing? (and not the conversion of the dataframe -> pyarrow table (for writing), or after reading the conversion of pyarrow table -> dataframe)

This is my understanding as well.

> When converting a large dataframe like this, I think we automatically use chunked array to be able to represent this in a ListType? But when reading from Parquet, I would assume we also use chunks per record batch?

Yes, I wasn't thinking clearly. One possible conclusion is that we aren't doing chunking when reading from parquet->arrow->pandas? Is that possible?
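A minimal sketch of how one could probe that hypothesis (assuming pyarrow; the file name and toy data below are illustrative placeholders, not the reporter's reproduction): write a table with a ListType column to Parquet, read it back, and inspect how many chunks the reader produced per column.

```python
# Hypothetical check of chunking behavior on the Parquet read path;
# "example.parquet" and the toy data are placeholders, not from the issue.
import pyarrow as pa
import pyarrow.parquet as pq

# Write a small table with a ListType column.
table = pa.table({"values": pa.array([[1, 2, 3]] * 10, type=pa.list_(pa.int64()))})
pq.write_table(table, "example.parquet")

# Each column of the result is a ChunkedArray; num_chunks shows
# whether the reader split the data into multiple chunks.
result = pq.read_table("example.parquet")
for name, column in zip(result.column_names, result.columns):
    print(name, "chunks:", column.num_chunks)
```

On a file this small the reader will typically return a single chunk; the interesting case is a list column large enough that its 32-bit offsets would overflow without chunking, which the same pattern can probe at scale.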
