westonpace commented on issue #14606: URL: https://github.com/apache/arrow/issues/14606#issuecomment-1309172640
Memory problems are very difficult to debug and answer. There are a lot of factors involved. Can you perhaps create some kind of reproducible example?

> Do we need to manually release the unused memory pool in default_shared_pool to keep memory efficient?

No, you should not have to do this. Those capabilities are more for debugging and very corner-case scenarios. They should not be needed for regular use.

> However, we observe there are some native memory consumption even if there are no incoming data and the residual memory in native heap continue to grow (not seems to be leak) and sometimes can bring OOM issues when we are processing very large data

What sorts of API calls are you making to do this conversion? Is this only repeated calls to `pyarrow.parquet.write_table`? How are you creating your tables / arrays?
