Hello,

On Mon, 13 Sep 2021 16:08:19 +0200
Antoine Pitrou <[email protected]> wrote:
> 
> My initial fix is to simply remove the limitation.  That is based on
> the interpretation that the "message size" is simply the encoded size
> of a Thrift payload.  Since we load the Thrift message entirely in
> memory from the Parquet file, based on what the Parquet metadata says,
> the fact that another size is recorded in the Thrift message shouldn't
> ideally be a problem.  But of course, that feels a bit unsatisfactory
> (I cannot say for sure whether a problem exists or not).
> https://github.com/apache/arrow/pull/11123

I'm following up now that I've read through the relevant Thrift C++
transport implementations.  I'm reasonably convinced that my analysis
is correct, as the max message size applies to encoded Thrift bytes,
and we already know the encoded size.  I still hope to receive an answer
from the Thrift community on
https://issues.apache.org/jira/browse/THRIFT-5237.
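
To make the reasoning concrete, here is a toy Python model of how I
understand the limit to behave.  It is only a sketch under that
assumption: the names BoundedMemoryTransport, MessageTooLarge and
read_whole_message are hypothetical and are not Thrift's actual API or
code.  The point it illustrates is that the limit acts as a budget on
encoded bytes read from the transport, so when the whole payload is
already in memory with a known length, the check tells us nothing we
did not already know.

```python
class MessageTooLarge(Exception):
    """Raised when a read would exceed the configured size budget."""


class BoundedMemoryTransport:
    """Toy in-memory transport with a max message size.

    Hypothetical class for illustration only; not Thrift's API.
    """

    def __init__(self, payload: bytes, max_message_size: int):
        self._payload = payload
        self._pos = 0
        # Assumption: the budget counts encoded bytes and nothing else.
        self._remaining = max_message_size

    def read(self, n: int) -> bytes:
        # Refuse any read that would go past the remaining budget.
        if n > self._remaining:
            raise MessageTooLarge(
                f"read of {n} bytes exceeds remaining budget {self._remaining}"
            )
        chunk = self._payload[self._pos:self._pos + n]
        self._pos += len(chunk)
        self._remaining -= len(chunk)
        return chunk


def read_whole_message(payload: bytes, max_message_size: int) -> bytes:
    """Read an already-loaded payload through the bounded transport."""
    transport = BoundedMemoryTransport(payload, max_message_size)
    return transport.read(len(payload))
```

In this model, passing len(payload) as the limit always succeeds, and
a smaller limit only rejects bytes that are already sitting in memory.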

Did nobody experience this issue with other Parquet implementations?

Regards

Antoine.

