jorisvandenbossche commented on issue #34274: URL: https://github.com/apache/arrow/issues/34274#issuecomment-1447933460
> I used `dataframe.info()`, which said the memory usage is `10G+`

If you have object dtype columns (for example string columns), this can be a large underestimate. You can pass `dataframe.info(memory_usage="deep")` to get the full memory usage of the pandas.DataFrame. This can also be more than the memory usage you see for the pyarrow.Table (using `table.nbytes` as Weston mentioned above), since for some data types (such as strings), pandas is less memory-efficient than pyarrow.
