oliviermeslin commented on issue #37655: URL: https://github.com/apache/arrow/issues/37655#issuecomment-1754411202
@vkhodygo: I'm not sure what you mean by "improving this further".

- If you mean that the example I provided is very rough and could be improved, you are quite right: I was just trying to find an easy example that reproduces the bug.
- If you mean that working on data types could be a solution, I'm really not sure: it would clearly help in some cases, but you would still have the same problem if the datasets to be merged are large enough. For instance, in my daily use case, I have to merge datasets with 20M-50M observations and 100+ columns, so the 4GB limit is very much binding, even after improving data types.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
