I just wanted to introduce myself to the group before I start asking lots
of questions. I'm a software engineer mostly working with
Scala/Spark/Kudu/Parquet in my day job, and in my spare time I have been
working on a POC of a distributed data platform implemented in Rust. The
project is called DataFusion (https://www.datafusion.rs/).

The project is at a very early stage and the implementation currently uses
very simple row-based processing, but the performance is already quite
exciting to me (the current test case is 4x faster than Apache Spark).

I have decided that I should now concentrate on making Apache Arrow the
native memory format so that I can implement more efficient data processing
and make it easier in the future to integrate with things like
Kudu and Parquet. It's also just a great way for me to learn about

I'm just in the process of getting Arrow to compile and reading the docs.
I'm sure I'll be back soon with questions.


