Hi Apache Arrow users,

I'm looking for some help getting started with Apache Arrow. We have a distributed system with the following sequence of operations:

Compute Node 1: Captures measurement data as two 32-bit float arrays (150k values per array), which are transferred to a Kafka cluster.

Compute Node 2: Should listen on Kafka for new data logs (the data logs are micro-batches of measurement data), fetch the data when it arrives, and then hand it over to another process/thread where the data is analysed.
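
To make the data volume concrete, here is a minimal sketch of the shape of one micro-batch. The `MicroBatch` struct and the `payload_bytes`, `pack` and `unpack` helpers are hypothetical names used only for illustration, and the layout (both arrays back to back, native byte order, no header) is an assumption rather than our actual wire format. It uses only the C++ standard library and makes no Arrow or Kafka calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(float) == 4, "sketch assumes 32-bit floats");

// Hypothetical container for one micro-batch: two equally long float32 arrays.
struct MicroBatch {
    std::vector<float> a;
    std::vector<float> b;
};

// Bytes needed for the raw values of one micro-batch (no header, no metadata).
std::size_t payload_bytes(std::size_t values_per_array) {
    return 2 * values_per_array * sizeof(float);
}

// Node 1 side: copy both arrays back to back into one contiguous byte buffer.
std::vector<std::uint8_t> pack(const MicroBatch& batch) {
    assert(batch.a.size() == batch.b.size());
    const std::size_t half = batch.a.size() * sizeof(float);
    std::vector<std::uint8_t> out(2 * half);
    if (half > 0) {
        std::memcpy(out.data(), batch.a.data(), half);
        std::memcpy(out.data() + half, batch.b.data(), half);
    }
    return out;
}

// Node 2 side: split the byte buffer back into the two float arrays.
MicroBatch unpack(const std::vector<std::uint8_t>& payload) {
    assert(payload.size() % (2 * sizeof(float)) == 0);
    const std::size_t n = payload.size() / (2 * sizeof(float));
    const std::size_t half = n * sizeof(float);
    MicroBatch batch{std::vector<float>(n), std::vector<float>(n)};
    if (half > 0) {
        std::memcpy(batch.a.data(), payload.data(), half);
        std::memcpy(batch.b.data(), payload.data() + half, half);
    }
    return batch;
}
```

With 150k values per array, `payload_bytes(150000)` gives 2 × 150,000 × 4 = 1,200,000 bytes of raw values per micro-batch, before any framing or metadata.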

The system is mainly written in C++, with parts of the final data analysis in C#. How could I effectively utilize Arrow in such a scenario? Or should I use an alternative solution? Are there any examples or code snippets that could help?

Thanks!
