felipecrv commented on issue #39682:
URL: https://github.com/apache/arrow/issues/39682#issuecomment-1932870929

   @mapleFU I understand the error is raised from `BaseBinaryBuilder<T>` 
(the superclass of `StringBuilder`). The issue is not an inability to allocate 
more than 2 GB of RAM; it is that a `StringArray` cannot address more than 
2 GB of string data, because its offsets buffer holds 32-bit offsets. The 
Parquet reader should figure out how to read these strings into a 
`LargeStringArray`. That means writing 64-bit offsets into the offsets buffer 
of the resulting `LargeStringArray`.

