Hi Gavin,

There was no detailed discussion in the meeting about this, just some
general comments, but I'll share a few areas of collaboration that I'm
aware of:
- There is work ongoing to enable the Arrow C++ compute engine (aka
"Acero") to consume Substrait plans, convert them into ExecPlans, and
execute them. Work started on this late last year [1] and has
continued since then [2].
- There are plans to adopt Substrait in DataFusion [3] and Ballista [4]

There are also several other Substrait-related projects not directly in
Arrow repos that engineers at Voltron Data are working on:
- Creating a Substrait compiler for Ibis [5], to allow Python users to
write code in a convenient analytics DSL and have it execute on
engines that can consume Substrait
- Creating a Substrait compiler for dplyr [6], to allow R users to
write dplyr code that can execute on engines that can consume
Substrait
- Creating a Substrait plan validator [7]
- Planning for "ADBC" to support Substrait [8]
- Defining more functions in the Substrait specification [9] <-- This
is an area where we could use more help

Thanks,
Ian

[1] https://github.com/apache/arrow/pull/11707
[2] https://github.com/apache/arrow/pulls?q=is%3Apr+substrait+label%3Alang-c%2B%2B
[3] https://github.com/apache/arrow-datafusion/issues/2646
[4] https://github.com/apache/arrow-ballista/issues/32
[5] https://github.com/ibis-project/ibis-substrait/
[6] https://github.com/voltrondata/substrait-r
[7] http://github.com/substrait-io/substrait-validator
[8] https://docs.google.com/document/d/1t7NrC76SyxL_OffATmjzZs2xcj1owdUsIF2WKL_Zw1U/
[9] https://github.com/substrait-io/substrait/tree/main/extensions



On Wed, Jun 8, 2022 at 5:41 PM Gavin Ray <ray.gavi...@gmail.com> wrote:
>
> Thanks Ian -- can I ask whether there was any discussion of note that
> happened around Arrow + Substrait stuff?
>
>
> On Wed, Jun 8, 2022 at 5:31 PM Ian Cook <i...@ursacomputing.com> wrote:
>
> > Attendees:
> >
> > Ian Cook
> > Raúl Cumplido
> > Alenka Frim
> > Ian Joiner
> > Will Jones
> > Jorge Leitão
> > David Li
> > Rok Mihevc
> > Ashish Paliwal
> > Matthew Topol
> > Jacob Wujciak
> >
> >
> > Discussion:
> >
> > Recent changes to the merge script for apache/arrow PRs
> > - Now uses a personal access token (PAT) to authenticate to the ASF Jira
> > - Now requires the GitHub PAT to have workflow scope
> > - See discussion about this on Zulip [1]
> >
> > Stabilizing the C Stream interface
> > - It has been 20 months since its introduction, with no changes
> > - See the ML discussion [2] about this
> > - Will Jones has put up two PRs [3][4] and started a vote [5] about
> > this on the mailing list
> >
> > Changes to release management guide
> > - Most of the content from the release management guide has been moved
> > [6] from Confluence [7] to the Arrow repo [8] where it is built as
> > part of the Arrow docs site [9]
> >
> > Proposed changes to release process
> > - Raúl has proposed [10] a change to the release process to simplify
> > creation of release candidates and has opened a PR [11] to update the
> > release management guide to reflect this change
> >
> > Substrait project
> > - There is more collaboration happening between the Arrow and Substrait
> > projects
> > - There is a Substrait Community page [12] with details about how to
> > get involved in Substrait
> >
> > Proposal to Dockerize the integration tests:
> > - Jorge opened a PR proposing this [13] that Raúl and Jacob are reviewing
> >
> > [1]
> > https://ursalabs.zulipchat.com/#narrow/stream/180245-dev/topic/Merge.20script.20with.20API.20keys/near/285049925
> > [2] https://lists.apache.org/thread/0y604o9s3wkyty328wv8d21ol7s40q55
> > [3] https://github.com/apache/arrow/pull/13345
> > [4] https://github.com/apache/arrow-rs/pull/1821
> > [5] https://lists.apache.org/thread/5bvk6m3y3wl0m4jdsnyhdylt1w5j288k
> > [6] https://github.com/apache/arrow/pull/13272
> > [7]
> > https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide
> > [8]
> > https://github.com/apache/arrow/blob/master/docs/source/developers/release.rst
> > [9] https://arrow.apache.org/docs/dev/developers/release.html
> > [10] https://lists.apache.org/thread/g6mqpyq2hc11xbgrq2pf653njzy53plt
> > [11] https://github.com/apache/arrow/pull/13308
> > [12] https://substrait.io/community/
> > [13] https://github.com/apache/arrow/pull/12407
> >
> > On Wed, Jun 8, 2022 at 10:44 AM Ian Cook <i...@ursacomputing.com> wrote:
> > >
> > > Hi all,
> > >
> > > Our biweekly sync call is today at 12:00 noon Eastern time.
> > >
> > > The Zoom meeting URL for this and other biweekly Arrow sync calls is:
> > > https://zoom.us/j/87649033008?pwd=SitsRHluQStlREM0TjJVYkRibVZsUT09
> > >
> > > Alternatively, enter this information into the Zoom website or app to
> > > join the call:
> > > Meeting ID: 876 4903 3008
> > > Passcode: 958092
> > >
> > > Thanks,
> > > Ian
> >
