mapleFU commented on issue #36940:
URL: https://github.com/apache/arrow/issues/36940#issuecomment-1657232169
Hmm, the file is quite small. I guess the code below is where the exception
is thrown. You can then read the metadata with a larger container size limit
or string size limit in Python or C++ to confirm whether the container size
limit is large enough.
```c++
template <class T>
void DeserializeUnencryptedMessage(const uint8_t* buf, uint32_t* len,
                                   T* deserialized_msg) {
  // Deserialize msg bytes into c++ thrift msg using memory transport.
  auto tmem_transport =
      CreateReadOnlyMemoryBuffer(const_cast<uint8_t*>(buf), *len);
  apache::thrift::protocol::TCompactProtocolFactoryT<ThriftBuffer>
      tproto_factory;
  // Protect against CPU and memory bombs
  tproto_factory.setStringSizeLimit(string_size_limit_);
  tproto_factory.setContainerSizeLimit(container_size_limit_);
  auto tproto = tproto_factory.getProtocol(tmem_transport);
  try {
    deserialized_msg->read(tproto.get());
  } catch (std::exception& e) {
    std::stringstream ss;
    ss << "Couldn't deserialize thrift: " << e.what() << "\n";
    throw ParquetException(ss.str());
  }
  uint32_t bytes_left = tmem_transport->available_read();
  *len = *len - bytes_left;
}
```
```
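To illustrate why a too-low container size limit makes deserialization fail, here is a minimal toy sketch. It is not Thrift's or Arrow's implementation: `ReadToyList` and its one-byte "count, then elements" format are hypothetical, made up for this example. It only shows the general idea that the declared element count is checked against the limit before anything is allocated, so a valid but large container is rejected the same way a malicious one would be.

```c++
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical toy format (an assumption for illustration only): the first
// byte declares how many one-byte elements follow.
std::vector<uint8_t> ReadToyList(const std::vector<uint8_t>& buf,
                                 int32_t container_size_limit) {
  if (buf.empty()) {
    throw std::runtime_error("empty buffer");
  }
  const int32_t declared = buf[0];
  // Reject an oversized declared count before allocating anything.
  // A non-positive limit is treated as "no limit" in this sketch.
  if (container_size_limit > 0 && declared > container_size_limit) {
    throw std::runtime_error("container size " + std::to_string(declared) +
                             " exceeds limit " +
                             std::to_string(container_size_limit));
  }
  // Guard against a count that promises more bytes than the buffer holds.
  if (buf.size() - 1 < static_cast<std::size_t>(declared)) {
    throw std::runtime_error("truncated list");
  }
  return std::vector<uint8_t>(buf.begin() + 1, buf.begin() + 1 + declared);
}
```

With the buffer `{3, 10, 20, 30}`, a limit of 100 returns the three elements, while a limit of 2 throws even though the data itself is well formed. That is the situation to rule out here by retrying with larger limits. As far as I know, those are exposed as `thrift_string_size_limit` and `thrift_container_size_limit` on `pyarrow.parquet.ParquetFile`, and as `set_thrift_string_size_limit` / `set_thrift_container_size_limit` on `parquet::ReaderProperties` in C++.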