Hello,

On Mon, 13 Sep 2021 16:08:19 +0200
Antoine Pitrou <[email protected]> wrote:
>
> My initial fix is to simply remove the limitation. That is based on
> the interpretation that the "message size" is simply the encoded size
> of a Thrift payload. Since we load the Thrift message entirely in
> memory from the Parquet file, based on what the Parquet metadata says,
> the fact that another size is recorded in the Thrift message shouldn't
> ideally be a problem. But of course, that feels a bit unsatisfactory
> (I cannot say for sure whether a problem exists or not).
>
> https://github.com/apache/arrow/pull/11123

I'm following up now that I've read through the relevant Thrift C++
transport implementations. I'm reasonably convinced that my analysis is
correct, as the max message size applies to encoded Thrift bytes, and we
already know the encoded size.

I still hope to receive an answer from the Thrift community on
https://issues.apache.org/jira/browse/THRIFT-5237.

Did nobody experience this issue with other Parquet implementations?

Regards

Antoine.
