jhorstmann commented on issue #5775: URL: https://github.com/apache/arrow-rs/issues/5775#issuecomment-2131307588
I had an attack of "not invented here" syndrome over the last few days 😅 and worked on an alternative code generator for Thrift that would let me more easily try out some changes to the generated code. The repo can be found at <https://github.com/jhorstmann/compact-thrift/> and the output for `parquet.thrift` at <https://github.com/jhorstmann/compact-thrift/blob/main/src/main/rust/tests/parquet.rs>.

The current output still allocates for string and binary fields, but running the benchmarks from <https://github.com/tustvold/arrow-rs/tree/thrift-bench> shows some nice improvements. This is the comparison with the current arrow-rs code, so both versions should be doing the same number of allocations:

```
decode metadata         time:   [32.592 ms 32.645 ms 32.702 ms]
decode metadata new     time:   [17.440 ms 17.476 ms 17.532 ms]
```

So incidentally very close to that 2x improvement. The main difference in the code should be avoiding most of the abstractions from `TInputProtocol` and avoiding stack moves by writing directly into default-initialized structs instead of moving from local variables.
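To illustrate the "write directly into a default-initialized struct" idea, here is a minimal sketch. It is not the code that compact-thrift generates: `Stats`, `fill_from`, `decode`, and the `read_*` helpers are hypothetical names, and only a small subset of the compact encoding is handled (short-form field headers, zigzag varint `i64`, and length-prefixed binary). Whether this shape actually avoids moves depends on the optimizer, so treat it as an illustration of the structure rather than a benchmarked implementation.

```rust
/// Hypothetical stand-in for a generated Thrift struct.
#[derive(Debug, Default, PartialEq)]
struct Stats {
    null_count: i64,
    distinct_count: i64,
    name: String,
}

/// Reads an unsigned LEB128 varint, advancing `input` past the consumed bytes.
fn read_uvarint(input: &mut &[u8]) -> Option<u64> {
    let mut value = 0u64;
    let mut shift = 0u32;
    loop {
        let (&byte, rest) = input.split_first()?;
        *input = rest;
        if shift >= 64 {
            return None; // more bytes than a u64 can hold
        }
        value |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(value);
        }
        shift += 7;
    }
}

/// Reads a zigzag-encoded varint as a signed 64-bit integer.
fn read_i64(input: &mut &[u8]) -> Option<i64> {
    let n = read_uvarint(input)?;
    Some(((n >> 1) as i64) ^ -((n & 1) as i64))
}

/// Reads a length-prefixed UTF-8 string (this still allocates).
fn read_string(input: &mut &[u8]) -> Option<String> {
    let len = usize::try_from(read_uvarint(input)?).ok()?;
    if len > input.len() {
        return None;
    }
    let (bytes, rest) = input.split_at(len);
    *input = rest;
    String::from_utf8(bytes.to_vec()).ok()
}

impl Stats {
    /// Decodes fields straight into `self`; there are no per-field locals
    /// that would later be moved into a struct literal.
    fn fill_from(&mut self, input: &mut &[u8]) -> Option<()> {
        let mut last_id = 0i16;
        loop {
            let (&header, rest) = input.split_first()?;
            *input = rest;
            if header == 0 {
                return Some(()); // stop byte ends the struct
            }
            let delta = i16::from(header >> 4);
            let field_type = header & 0x0f;
            if delta == 0 {
                return None; // long-form field ids are not handled in this sketch
            }
            last_id = last_id.checked_add(delta)?;
            match (last_id, field_type) {
                (1, 6) => self.null_count = read_i64(input)?,
                (2, 6) => self.distinct_count = read_i64(input)?,
                (3, 8) => self.name = read_string(input)?,
                _ => return None, // a real decoder would skip unknown fields
            }
        }
    }

    /// Reads directly from a byte slice, with no protocol trait object in between.
    fn decode(mut input: &[u8]) -> Option<Self> {
        let mut value = Self::default();
        value.fill_from(&mut input)?;
        Some(value)
    }
}
```

Calling `Stats::decode(&[0x16, 0x06, 0x16, 0x01, 0x18, 0x02, b'a', b'b', 0x00])` should yield a `Stats` with `null_count` 3, `distinct_count` -1, and `name` "ab". The contrasting pattern would read each field into an `Option` local and build the struct at the end, which is where the extra moves can come from.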
