OK, I am going to take the lack of an answer as an admission that our analysis is correct.
Thank you Antoine.

On Mon, 27 Sep 2021 11:51:07 +0200 Antoine Pitrou <anto...@python.org> wrote:
> Hello,
>
> (sorry, this is a rehash of a question asked on
> https://issues.apache.org/jira/browse/THRIFT-5237, since I haven't
> received any reply there)
>
> In Apache Parquet, some of our users have encountered situations where
> the Thrift 0.14 message size limitations would prevent from reading
> legitimate real-world data (see
> https://issues.apache.org/jira/browse/ARROW-13655 ). I have been
> trying to understand what kind of vulnerability the new limitations are
> designed to address, but have failed to find any precise analysis of
> the issue.
>
> Therefore I have tried to go by the Thrift C++ library source code and
> have come to the understanding that the vulnerability arises when using
> one of the streaming transports where the encoded message size isn't
> known in advance (such as socket-based). However, in Parquet C++ we read
> the full message in one block from the underlying random access file,
> and therefore it seems that disabling the max message size is
> legitimate in our case.
>
> Is my understanding ok? If not, can somebody shed a bit more light on
> what the vulnerability consists in?
>
> Regards
>
> Antoine.