Hey folks. I have been asked to share the latest flatbuffer prototype.
I will keep the latest version in this gist <https://gist.github.com/alkis/b2c78af23cb224671d7a8a77ac5f60b7>, with TODOs left in place in case folks want to collaborate. I am iterating in our internal C++ codebase; it would be nice if someone more knowledgeable about parquet-cpp could integrate this there so that we can do benchmarking and experimentation. Once that is set up, I would be happy to contribute the scaffolding that converts from thrift to flatbuffers and take it from there.

Other than the TODOs in the file, the following items are still missing:

- optimize Statistics: this is by far the biggest payload
- encryption is completely untouched/unthought
- column indexes
- bloom filters

Some of the above might have to stay as is.

The biggest blocker for me right now is collecting "interesting" footers from real tables (I very much dislike generated ones) and building a good repository of them to drive more design decisions.