Attendees:

- David Li
- Eduardo Ponce
- Gavin Ray
- Ian Cook
- James Duong
- Matthew Topol
- Nic
- Niranda
- Raul Cumplido
- Rok
- Weston Pace
- Will Jones

N.B. The Voltron Data folks have a scheduling conflict on 4/27 and will not be 
able to host the fortnightly sync call. Is anyone available to run the meeting 
that day?

Agenda:

8.0.0 Release: targeting 4/21. Please try to get PRs wrapped up in the next 
~1-2 weeks. See the ML post [1] for details, including a wiki page listing 
outstanding issues. In particular, there are some Go PRs that could use 
attention from an interested Go developer [2], as well as some temporal kernel 
PRs that could use a review [3].

Arrow C++ Compute Engine: Weston gave a status update. APIs and documentation 
have been improved for users, though most will likely use the engine through 
an interface like Substrait. Basic Substrait support has been added, with 
improvements forthcoming. More tooling to measure performance is being worked 
on, and general kernel execution overhead is being addressed with an eye 
towards running smaller batches through the engine. An asof join 
implementation is being worked on, and on the Go side, work is under way on 
Substrait bindings so that Go can bind to the C++ engine.

Kernel vectorization/SIMD: Eduardo has been looking at making some of the 
primitive kernels (e.g. arithmetic) more easily autovectorized by the compiler, 
testing a variety of approaches. See related discussion [4]. We do not have 
benchmarks to evaluate compiler performance in this regard generally, but we 
have manually inspected some compiler output and found that not all compilers 
manage to do this with the current kernel implementations. We also don't have a 
holistic way to evaluate this going forward, nor do we have a sense for current 
benchmark coverage, though possibly we could generate benchmarks. However, it 
was pointed out that general engine performance is likely more important, and 
that current profiling indicates kernels are not yet a bottleneck, though there 
may be low-hanging fruit here.
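
For readers unfamiliar with the topic, here is a minimal sketch of the kind of 
loop compilers tend to autovectorize: a flat pass over contiguous buffers with 
no branches in the body. The function names AddInt32 and AddVectors are 
hypothetical and are not Arrow's actual kernel code; this only illustrates the 
general idea discussed above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only; not taken from Arrow.
// A simple counted loop over contiguous memory, with no branches or
// function calls in the body, is a shape that optimizers can usually
// turn into SIMD instructions.
void AddInt32(const int32_t* left, const int32_t* right, int32_t* out,
              std::size_t length) {
  for (std::size_t i = 0; i < length; ++i) {
    // Add as unsigned to avoid signed-overflow undefined behavior.
    out[i] = static_cast<int32_t>(static_cast<uint32_t>(left[i]) +
                                  static_cast<uint32_t>(right[i]));
  }
}

// Convenience wrapper (also hypothetical) that allocates the output.
std::vector<int32_t> AddVectors(const std::vector<int32_t>& left,
                                const std::vector<int32_t>& right) {
  assert(left.size() == right.size());
  std::vector<int32_t> out(left.size());
  AddInt32(left.data(), right.data(), out.data(), left.size());
  return out;
}
```

Whether a given compiler actually vectorizes such a loop can be checked by 
building with optimizations and a vectorization report, for example Clang's 
-Rpass=loop-vectorize or GCC's -fopt-info-vec, or by inspecting the generated 
assembly as was done manually here.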

Flight/Flight SQL: we discussed the barriers to Flight SQL support in Go. 
Flight SQL heavily uses union types, which are not yet implemented in Go. A 
further proposal [5] has been submitted to extend the type metadata; those 
interested, please take a look. The GetXdbcTypeInfo proposal was merged, and 
the inline data proposal is still outstanding (but probably ready for a vote).

IPC/Format: someone asked whether there is an IPC structure for serializing a 
single array to reduce overhead. Current APIs likely suffice, but Niranda may 
start a separate discussion to explain further.

[1]: https://lists.apache.org/thread/zk8hhynvy0bqvqpxk0868n5g0nmzbzbn
[2]: https://github.com/apache/arrow/pull/12158
[3]: https://github.com/apache/arrow/pull/12657
[4]: https://lists.apache.org/thread/8o7k4dt23chx3gn13rwkms38syyms489
[5]: https://lists.apache.org/thread/thvn89wg29gyctwycx2zjr4vvm2g80o6

On Tue, Apr 12, 2022, at 16:17, Ian Cook wrote:
> Hi all,
>
> Our biweekly sync call is tomorrow at 12:00 noon Eastern time.
>
> The Zoom meeting URL for this and other biweekly Arrow sync calls is:
> https://zoom.us/j/87649033008?pwd=SitsRHluQStlREM0TjJVYkRibVZsUT09
>
> Alternatively, enter this information into the Zoom website or app to
> join the call:
> Meeting ID: 876 4903 3008
> Passcode: 958092
>
> Thanks,
> Ian
