etseidl commented on PR #8636: URL: https://github.com/apache/arrow-rs/pull/8636#issuecomment-3417179496
> There is also a report that the C++ thrift generated code is faster than this parser -- https://lists.apache.org/thread/skr7f2tf94q59cx390cq2sw8f1nps675
>
> I haven't been able to reproduce that result yet 🤔

So I did a quick sanity check on my workstation.

```c++
// 'buf' contains bytes for footer
for (int i=0; i < 1000; i++) {
  std::shared_ptr<TMemoryBuffer> strBuf(new TMemoryBuffer(buf, ender.footer_len));
  TCompactProtocol proto{strBuf};
  parquet::format::FileMetaData fmd;
  fmd.read(&proto);
}
```

vs

```rust
// 'meta_data' contains bytes for footer
for _ in 0..1000 {
    ParquetMetaDataReader::decode_metadata(&meta_data).unwrap();
}
```

c++ time: 64.004u 8.960s 1:13.06 99.8% 0+0k 0+0io 0pf+0w
rust time: 26.714u 0.019s 0:26.77 99.8% 0+0k 0+0io 0pf+0w

This is with thrift-cpp 0.23
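
For anyone who wants to repeat the Rust side with in-process wall-clock timing instead of the shell's `time`, a minimal sketch could look like the following. `time_iterations` is a hypothetical helper, not part of parquet or arrow-rs, and the closure in `main` is only a placeholder workload standing in for the `ParquetMetaDataReader::decode_metadata(&meta_data).unwrap()` call.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical helper: runs `f` `iters` times and returns the total
/// elapsed wall-clock time.
fn time_iterations<T>(iters: u32, mut f: impl FnMut() -> T) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        // black_box keeps the optimizer from discarding the result.
        black_box(f());
    }
    start.elapsed()
}

fn main() {
    // Placeholder workload (assumption): in the real check this would be
    // the decode_metadata call on the footer bytes.
    let data: Vec<u8> = (0..=255).collect();
    let total = time_iterations(1000, || data.iter().map(|&b| b as u64).sum::<u64>());
    println!("total: {:?}, per iteration: {:?}", total, total / 1000);
}
```

This measures wall-clock time only, so it will not reproduce the user/system split that `time` reports.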
