Hi Andy, my team at Ursa Computing is planning to contribute here significantly in 2021 and beyond (we're also hiring people to work on this), but AFAIK we haven't put forward any new documents beyond the ones I wrote in the past. We've been working on getting our general systems affairs in order in C++ (see the recent async and multithreading work) to provide a solid foundation for the buildout. The project in general has made a lot of progress on basic functions / expressions since the big refactor in May/June of last year. I think it's a good idea to have high-quality implementations of canonical algorithms (hash aggregation, joins, sorting, etc.) so they can be community-maintained and -optimized and reused in other projects.

Some extra-ASF data processing projects that are written in C++ and using Arrow to represent data:

* Cylon (@ multiple unis) https://github.com/cylondata/cylon
* Hustle (@ U Wisconsin) https://github.com/UWHustle/hustle
* NoisePage (@ CMU) https://github.com/cmu-db/noisepage
* RAPIDS / cuDF (https://github.com/rapidsai/cudf)

There's probably some others that I'm not aware of.

best,
Wes

On Sun, Feb 21, 2021 at 4:22 PM Andy Grove <andygrov...@gmail.com> wrote:
>
> I'm giving a talk about Ballista [1], Rust, and Apache Arrow this week [2]
> and I'd like to make sure I'm giving accurate information about other
> efforts around building execution engines.
>
> I know there is work happening in C++ but I haven't been following this.
> Could someone point me to any documentation/news about this or maybe
> provide a brief summary of the current status?
>
> Are there other projects that I should be aware of and mention in this talk?
>
> Thanks,
>
> Andy.
>
> [1] https://github.com/ballista-compute/ballista
> [2] https://www.meetup.com/nyhackr/events/276261812/