Re: [ANNOUNCE] New Arrow committer: Dane Pitkin

2024-05-07 Thread Kevin Gurney
Congratulations, Dane!

From: Dewey Dunnington 
Sent: Tuesday, May 7, 2024 12:18 PM
To: dev@arrow.apache.org 
Subject: Re: [ANNOUNCE] New Arrow committer: Dane Pitkin

Congrats!

On Tue, May 7, 2024 at 11:55 AM Raúl Cumplido  wrote:
>
> Congratulations Dane!
>
> On Tue, May 7, 2024, 16:32, Weston Pace wrote:
>
> > Congrats Dane!
> >
> > On Tue, May 7, 2024, 7:30 AM Nic Crane  wrote:
> >
> > > Congrats Dane, well deserved!
> > >
> > > On Tue, 7 May 2024 at 15:16, Gang Wu  wrote:
> > > >
> > > > Congratulations Dane!
> > > >
> > > > Best,
> > > > Gang
> > > >
> > > > On Tue, May 7, 2024 at 10:12 PM Ian Cook  wrote:
> > > >
> > > > > Congratulations Dane!
> > > > >
> > > > > On Tue, May 7, 2024 at 10:10 AM Alenka Frim wrote:
> > > > >
> > > > > > Yay, congratulations Dane!!
> > > > > >
> > > > > > On Tue, May 7, 2024 at 4:00 PM Rok Mihevc wrote:
> > > > > >
> > > > > > > Congrats Dane!
> > > > > > >
> > > > > > > Rok
> > > > > > >
> > > > > > > On Tue, May 7, 2024 at 3:57 PM wish maple <maplewish...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Congrats!
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Xuwei Fu
> > > > > > > >
> > > > > > > > On Tue, May 7, 2024 at 21:53, Joris Van den Bossche wrote:
> > > > > > > >
> > > > > > > > > On behalf of the Arrow PMC, I'm happy to announce that Dane Pitkin has
> > > > > > > > > accepted an invitation to become a committer on Apache Arrow. Welcome,
> > > > > > > > > and thank you for your contributions!
> > > > > > > > >
> > > > > > > > > Joris
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> >



Re: [ANNOUNCE] New Arrow committer: Sarah Gilmore

2024-04-11 Thread Kevin Gurney
Congratulations, Sarah!! Well deserved!

From: Jacob Wujciak 
Sent: Thursday, April 11, 2024 11:14 AM
To: dev@arrow.apache.org 
Subject: Re: [ANNOUNCE] New Arrow committer: Sarah Gilmore

Congratulations and welcome!

On Thu, Apr 11, 2024 at 17:11, Raúl Cumplido wrote:

> Congratulations Sarah!
>
> On Thu, Apr 11, 2024 at 13:13, Sutou Kouhei wrote:
> >
> > Hi,
> >
> > On behalf of the Arrow PMC, I'm happy to announce that Sarah
> > Gilmore has accepted an invitation to become a committer on
> > Apache Arrow. Welcome, and thank you for your contributions!
> >
> > Thanks,
> > --
> > kou
>


Re: [ANNOUNCE] New Committer Joel Lubinitsky

2024-04-01 Thread Kevin Gurney
Congratulations, Joel!


From: Jason Z 
Sent: Monday, April 1, 2024 11:13 AM
To: dev@arrow.apache.org 
Subject: Re: [ANNOUNCE] New Committer Joel Lubinitsky

Congrats Joel!


Thanks,
Jiashen


On Mon, Apr 1, 2024 at 8:10 AM Ian Cook  wrote:

> Congratulations Joel!
>
> On Mon, Apr 1, 2024 at 11:08 AM wish maple  wrote:
>
> > Congrats Joel!
> >
> > Best,
> > Xuwei Fu
> >
> > On Mon, Apr 1, 2024 at 22:59, Matt Topol wrote:
> >
> > > On behalf of the Arrow PMC, I'm happy to announce that Joel Lubinitsky has
> > > accepted an invitation to become a committer on Apache Arrow. Welcome, and
> > > thank you for your contributions!
> > >
> > > --Matt
> > >
> >
>


Re: [ANNOUNCE] New Arrow committer: Bryce Mecum

2024-03-19 Thread Kevin Gurney
Congratulations, Bryce!


From: Dewey Dunnington 
Sent: Tuesday, March 19, 2024 9:50 AM
To: dev@arrow.apache.org 
Subject: Re: [ANNOUNCE] New Arrow committer: Bryce Mecum

Congratulations Bryce! And thank you!

On Mon, Mar 18, 2024 at 2:16 PM Wes McKinney  wrote:
>
> Congrats!
>
> On Mon, Mar 18, 2024 at 12:15 PM James Duong wrote:
>
> > Congratulations Bryce!
> >
> > From: Dane Pitkin 
> > Date: Monday, March 18, 2024 at 7:28 AM
> > To: dev@arrow.apache.org 
> > Subject: Re: [ANNOUNCE] New Arrow committer: Bryce Mecum
> > Congratulations, Bryce!!
> >
> > On Mon, Mar 18, 2024 at 9:18 AM David Li  wrote:
> >
> > > Congrats Bryce!
> > >
> > > On Mon, Mar 18, 2024, at 08:52, Ian Cook wrote:
> > > > Congratulations Bryce!
> > > >
> > > > Ian
> > > >
> > > > On Sun, Mar 17, 2024 at 22:24 Nic Crane  wrote:
> > > >
> > > > >> On behalf of the Arrow PMC, I'm happy to announce that Bryce Mecum has
> > > > >> accepted an invitation to become a committer on Apache Arrow. Welcome, and
> > > > >> thank you for your contributions!
> > > >>
> > > >> Nic
> > > >>
> > >
> >



Re: [ANNOUNCE] New Arrow committer: Jeffrey Vo

2024-02-12 Thread Kevin Gurney
Congratulations, Jeffrey!

From: Alenka Frim 
Sent: Monday, February 12, 2024 2:04 AM
To: dev@arrow.apache.org 
Subject: Re: [ANNOUNCE] New Arrow committer: Jeffrey Vo

Congratulations Jeffrey!

On Tue, Feb 6, 2024 at 7:30 PM Raphael Taylor-Davies wrote:

> On behalf of the Arrow PMC, I am happy to announce that Jeffrey Vo has
> accepted an invitation to become a committer on Apache Arrow. Welcome,
> and thank you for your contributions!
>
> Raphael Taylor-Davies
>
>


Re: [ANNOUNCE] New Arrow committer: Felipe Oliveira Carvalho

2023-12-07 Thread Kevin Gurney
Congratulations, Felipe!

From: Daniël Heres 
Sent: Thursday, December 7, 2023 2:59 PM
To: dev@arrow.apache.org 
Subject: Re: [ANNOUNCE] New Arrow committer: Felipe Oliveira Carvalho

Congrats!

On Thu, Dec 7, 2023 at 20:52, Ben Harkins wrote:

> Congrats, Felipe!
>
> On Thu, Dec 7, 2023 at 2:00 PM Vibhatha Abeykoon wrote:
>
> > Congratulations Felipe.
> >
> > Vibhatha Abeykoon
> >
> >
> > On Fri, Dec 8, 2023 at 12:25 AM David Li  wrote:
> >
> > > Congrats Felipe!
> > >
> > > On Thu, Dec 7, 2023, at 13:02, Raúl Cumplido wrote:
> > > > Congratulations Felipe!
> > > >
> > > > > On Thu, Dec 7, 2023, 18:02, Dane Pitkin wrote:
> > > >
> > > >> Congrats, Felipe!
> > > >>
> > > > >> On Thu, Dec 7, 2023 at 11:41 AM hsseo0501 wrote:
> > > >>
> > > > >> > Congrats. Felipe :)
> > > > >> > Sent from my Galaxy
> > > > >> >
> > > > >> > Original message
> > > > >> > From: Ian Cook
> > > > >> > Date: 23/12/8 1:24 AM (GMT+09:00)
> > > > >> > To: dev@arrow.apache.org
> > > > >> > Subject: Re: [ANNOUNCE] New Arrow committer: Felipe Oliveira Carvalho
> > > > >> >
> > > > >> > Congratulations Felipe!!!
> > > > >> >
> > > > >> > On Thu, Dec 7, 2023 at 10:43 AM Benjamin Kietzman wrote:
> > > > >> > >
> > > > >> > > On behalf of the Arrow PMC, I'm happy to announce that Felipe Oliveira
> > > > >> > > Carvalho has accepted an invitation to become a committer on Apache
> > > > >> > > Arrow. Welcome, and thank you for your contributions!
> > > > >> > >
> > > > >> > > Ben Kietzman
> > > >>
> > >
> >
>


--
Daniël Heres


Re: [ANNOUNCE] New Arrow PMC chair: Andy Grove

2023-11-27 Thread Kevin Gurney
Congratulations, Andy!

From: Raúl Cumplido 
Sent: Monday, November 27, 2023 8:58 AM
To: dev@arrow.apache.org 
Subject: Re: [ANNOUNCE] New Arrow PMC chair: Andy Grove

Congratulations Andy and thanks for the effort during last year Andrew!

On Mon, Nov 27, 2023 at 14:54, David Li wrote:
>
> Congrats Andy!
>
> On Mon, Nov 27, 2023, at 08:02, Mehmet Ozan Kabak wrote:
> > Congratulations Andy. I am sure we will keep building great tech this
> > year, just like last year, under your watch.
> >
> > Mehmet Ozan Kabak
> >
> >
> >> On Nov 27, 2023, at 3:47 PM, Daniël Heres  wrote:
> >>
> >> Congrats Andy!
> >>
> >> On Mon, Nov 27, 2023 at 13:47, Andrew Lamb wrote:
> >>
> >>> I am pleased to announce that the Arrow Project has a new PMC chair and VP
> >>> as per our tradition of rotating the chair once a year. I have resigned and
> >>> Andy Grove was duly elected by the PMC and approved unanimously by the
> >>> board.
> >>>
> >>> Please join me in congratulating Andy Grove!
> >>>
> >>> Thanks,
> >>> Andrew
> >>>
> >>
> >>
> >> --
> >> Daniël Heres



Re: [ANNOUNCE] New Arrow committer: James Duong

2023-11-16 Thread Kevin Gurney
Congratulations, James!

From: David Li 
Sent: Thursday, November 16, 2023 9:30 AM
To: dev@arrow.apache.org 
Subject: Re: [ANNOUNCE] New Arrow committer: James Duong

Congrats James!

On Thu, Nov 16, 2023, at 09:14, Dane Pitkin wrote:
> Congrats, James!
>
> On Thu, Nov 16, 2023 at 4:23 AM Alenka Frim wrote:
>
>> Congratulations!
>>
>> On Thu, Nov 16, 2023 at 8:46 AM Joris Van den Bossche <jorisvandenboss...@gmail.com> wrote:
>>
>> > Congrats!
>> >
>> > On Thu, 16 Nov 2023 at 08:44, Sutou Kouhei  wrote:
>> > >
>> > > On behalf of the Arrow PMC, I'm happy to announce that James Duong
>> > > has accepted an invitation to become a committer on Apache
>> > > Arrow. Welcome, and thank you for your contributions!
>> > >
>> > > --
>> > > kou
>> > >
>> > >
>> >
>>



Re: [ANNOUNCE] New Arrow PMC member: Raúl Cumplido

2023-11-13 Thread Kevin Gurney
Congratulations, Raúl!


From: Nic Crane 
Sent: Monday, November 13, 2023 2:31 PM
To: dev@arrow.apache.org 
Subject: Re: [ANNOUNCE] New Arrow PMC member: Raúl Cumplido

Congrats Raul!

On Tue, 14 Nov 2023, 03:28 Andrew Lamb,  wrote:

> The Project Management Committee (PMC) for Apache Arrow has invited
> Raúl Cumplido to become a PMC member and we are pleased to announce
> that Raúl Cumplido has accepted.
>
> Please join me in congratulating them.
>
> Andrew
>


Re: [DISCUSS][MATLAB] Proposal for incremental point releases of the MATLAB interface

2023-11-08 Thread Kevin Gurney
Hi Kou and Dewey,

Thank you very much for your very thorough and detailed responses to all of our
questions. This is extremely valuable feedback and the points that you made
make a lot of sense.

Sarah and I talked this over a bit more and we think that sticking with the 
overall apache/arrow project release cycle (i.e. stay in line with 15.0.0) 
makes the most sense in the long term.

@Dewey - thanks very much for highlighting the pros and cons of creating a 
separate repository. We also really appreciate the community being willing to 
try and support our development needs. That being said, we think it is probably 
best to stay in-model with the main apache/arrow release process for the time 
being rather than creating a separate repository for the MATLAB interface.

To address some related points and questions:

> Can we just mention "This is not stable yet!!!" in the documentation instead 
> of using isolated version?

Yes. This is a good point and we already have a disclaimer in the README.md [1]
for the MATLAB interface which says: "Warning The MATLAB interface is under 
active development and should be considered experimental."

> It's better that we use CI for this like other binary packages such as 
> .deb/.rpm/.wheel/.jar/...

This makes sense and we agree. We will follow up with PRs to add the necessary 
MATLAB packaging scripts and CI workflow files.

> Does the MLTBX file include Apache Arrow C++ binaries too like .wheel/.jar?

Yes. The MLTBX file will package the Apache Arrow C++ binaries, similar to the 
Java JARs / Python wheels.

> MATLAB doesn't provide the official package repository such as PyPI for 
> Python and https://rubygems.org/ for Ruby, right?

The equivalent to pypi.org or rubygems.org for MATLAB would be the MathWorks 
File Exchange [2].

> If the official package repository for MATLAB doesn't exist, JFrog is better 
> because the MLTBX file will be large (Apache Arrow C++ binaries are large).

As noted above, the "official package repository" for MATLAB would be the 
MathWorks File Exchange. File Exchange has tight integration with GitHub [3]. 
When a new release is available in GitHub Releases, the associated File 
Exchange entry will be automatically updated.

We believe we could leverage this integration between File Exchange and GitHub 
Releases to automate the MATLAB interface release process. This approach might 
look like:

1. Upload MLTBX to JFrog Artifactory
2. Run a post release script that would:
2.1 Download MLTBX from JFrog Artifactory
2.2 Upload to GitHub Releases (e.g. apache/arrow-matlab - see discussion below)
2.3 Linked File Exchange entry will be automatically updated
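
A minimal sketch of the post-release step (2.1 to 2.3), assuming a hypothetical asset naming scheme, a hypothetical Artifactory path, and a placeholder release ID. The upload URL follows the documented shape of the GitHub REST API endpoint for release assets; everything else is illustrative, and nothing is actually downloaded or uploaded here:

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class ReleasePlan:
    """URLs a post-release script would use; no network calls are made."""
    asset_name: str
    download_url: str
    upload_url: str


def build_release_plan(
    version: str,
    release_id: int,
    repo: str = "apache/arrow-matlab",  # hypothetical "bridge" repository
    artifactory_base: str = "https://apache.jfrog.io/artifactory/arrow/matlab",  # hypothetical path
) -> ReleasePlan:
    # Hypothetical naming scheme for the packaged toolbox file.
    asset_name = f"matlab-arrow-{version}.mltbx"
    # Step 2.1: where the script would fetch the MLTBX from.
    download_url = f"{artifactory_base}/{version}/{quote(asset_name)}"
    # Step 2.2: GitHub release-asset upload endpoint for an existing release.
    upload_url = (
        f"https://uploads.github.com/repos/{repo}/releases/{release_id}/assets"
        f"?name={quote(asset_name)}"
    )
    return ReleasePlan(asset_name, download_url, upload_url)


if __name__ == "__main__":
    plan = build_release_plan("15.0.0", release_id=1)
    print("2.1 download:", plan.download_url)
    print("2.2 upload:  ", plan.upload_url)
    print("2.3 File Exchange entry updates from the new GitHub release")
```

Keeping URL construction separate from the actual transfer would let the plan be reviewed (or dry-run in CI) before any artifact is published.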

One open question about this approach: which GitHub repository should we use 
for hosting the MLTBX via GitHub Releases?

We don't think using the main apache/arrow GitHub Releases area is the right 
approach. So, would it make sense to create a separate "bridge" repository just 
for hosting the latest MLTBX files? Should this be an ASF associated repository 
like apache/arrow-matlab or would a MathWorks associated repository like 
mathworks/arrow-matlab be OK? We aren't sure what makes the most sense here, 
but welcome any suggestions.

> We may want to use the status page for it: 
> https://arrow.apache.org/docs/status.html

Thanks for highlighting this. This makes sense, and we can follow up with a PR 
to add MATLAB to the status page.

> How about creating https://arrow.apache.org/docs/matlab/ ? We can use Sphinx 
> like the Python docs https://arrow.apache.org/docs/python/ or another 
> documentation tool like the R docs https://arrow.apache.org/docs/r/ . If we
> use Sphinx, we can create 
> https://github.com/apache/arrow/tree/main/docs/source/matlab/

This makes sense and eventually we want to have comprehensive documentation in 
line with other language bindings using Sphinx. In addition to comprehensive 
documentation, we were also hoping that we could host release notes in a place 
that is easily accessible from the MLTBX download location. File Exchange 
entries have a "Version History" which includes release notes from the 
"backing" GitHub Releases area. So, this would probably be a sensible location 
to put the release notes. Including MATLAB updates in Apache Arrow
release blog posts (e.g.
https://arrow.apache.org/blog/2023/11/01/14.0.0-release/) may also be helpful.

--

We really appreciate all of the community's guidance on navigating the release 
process!

We will get started on integrating with the existing release tooling.

[1] https://github.com/apache/arrow/tree/main/matlab#status
[2] https://www.mathworks.com/matlabcentral/fileexchange
[3] https://www.mathworks.com/matlabcentral/content/fx/about.html#Why_GitHub

Best Regards,

Kevin Gurney

From: Dewey Dunnington 
Sent: Tuesday, November 7, 2023 8:53 PM
To: dev@

[DISCUSS][MATLAB] Proposal for incremental point releases of the MATLAB interface

2023-11-07 Thread Kevin Gurney
Hi All,

A considerable amount of new functionality has been added to the MATLAB 
interface over the last few months. We appreciate all the community's support 
in making this possible and are happy to see all the progress that is being 
made.

At this point, we would like to create an initial "0.1" release of the MATLAB 
interface. Incremental point releases will enable MATLAB users to provide early 
feedback. In addition, learning how to navigate the release process is an 
important step towards eventually releasing a stable 1.0 version of the MATLAB 
interface.

Our proposed approach to creating an initial release would be to:

1. Manually build the MATLAB interface on Windows, macOS, and Linux
2. Combine all of the cross platform build artifacts into a single MLTBX file 
[1] for distribution
3. Host the MLTBX somewhere that is easily accessible for download

For reference - MLTBX is a standard packaging format for MATLAB which enables 
simple "one-click" installation - analogous to a Python pip package or a Ruby 
gem.

Creating an MLTBX file manually should be relatively low effort. However, in 
the long term, we would love to enable semi-automated "push button" releases 
via GitHub Actions (and possibly even "nightly builds").

Since this is our first time creating a release of the MATLAB interface, we 
wanted to draw on the community's expertise to answer a few questions:

1. Is there a recommended location where we can host the MLTBX file? e.g. 
GitHub Releases [2], JFrog [3], etc.?
2. Is there a recommended location for hosting release notes?
3. Is there a recommended cadence for incremental point releases?
4. Are there any notable ASF procedures [4] [5] (e.g. voting on a new release 
proposal) that we should be aware of as we consider creating an initial release?
5. How should the Arrow project release (e.g. 14.0.0) relate to the MATLAB
interface version (e.g. 0.1)? As a point of reference, we noticed that PyArrow
is on version 14.0.0, but it feels "misleading" to say that the MATLAB 
interface is at version 14.0.0 when we haven't yet implemented or stabilized 
all core Arrow APIs. Is there any precedent for using independent release 
versions for language bindings which are not fully stabilized and are also part 
of the main apache/arrow repository?

We've noticed that Arrow-related projects which are not part of the main 
apache/arrow GitHub repository (e.g. DataFusion) follow a mailing list-based 
voting and release process. However, it's not clear whether it makes sense to 
follow this process for the MATLAB interface since it is part of the main 
apache/arrow repository.

We sincerely appreciate the community's help and guidance on this topic!

Please let us know if you have any questions.

[1] https://www.mathworks.com/help/matlab/creating-help.html?s_tid=CRUX_lftnav
[2] https://github.com/apache/arrow/releases
[3] https://apache.jfrog.io/ui/native/arrow/
[4] https://www.apache.org/foundation/voting.html
[5] https://www.apache.org/legal/release-policy.html#release-approval

Best Regards,

Kevin Gurney


Re: Request for Assistance with MATLAB CI Workflow Issue

2023-10-24 Thread Kevin Gurney
Hi Divyansh,

As Dane pointed out, I did leave some code review comments earlier today on the 
PR that may provide some guidance on how to address the CI failures.

If you have any specific questions about any of the code review feedback, 
please feel free to continue the discussion on the PR, and I will be happy to 
provide further assistance.

Best Regards,

Kevin Gurney


From: Dane Pitkin 
Sent: Tuesday, October 24, 2023 3:49 PM
To: dev@arrow.apache.org 
Subject: Re: Request for Assistance with MATLAB CI Workflow Issue

Hey Divyansh,

I suggest discussing this on the PR[1] itself. You will get the best
discussion there, where comments can be added to the code review directly.
It looks like there is a MATLAB reviewer that has already left feedback
that may be helpful.

[1] https://github.com/apache/arrow/pull/38274

On Tue, Oct 24, 2023 at 2:28 PM Divyansh Khatri 
wrote:

> I am currently working on GitHub Issue #38211, titled "[MATLAB] Add support
> for creating an empty arrow.tabular.RecordBatch by calling
> arrow.recordBatch with no input arguments." As a beginner in MATLAB
> development, I am facing challenges with the failing MATLAB CI workflow
> associated with this issue. We need to ensure that all the appropriate test
> cases pass as expected.
> If you have any insights or suggestions on how to fix this issue, please
> share them with me.
>
> -Divyansh
>


Re: [ANNOUNCE] New Arrow committer: Xuwei Fu

2023-10-23 Thread Kevin Gurney
Congratulations, Xuwei!

From: Jacob Wujciak-Jens 
Sent: Monday, October 23, 2023 11:13 AM
To: dev@arrow.apache.org 
Subject: Re: [ANNOUNCE] New Arrow committer: Xuwei Fu

Congrats and welcome!

On Mon, Oct 23, 2023 at 5:02 PM Ian Cook  wrote:

> Congratulations Xuwei!
>
> On Mon, Oct 23, 2023 at 12:46 AM Sutou Kouhei  wrote:
> >
> > On behalf of the Arrow PMC, I'm happy to announce that Xuwei Fu
> > has accepted an invitation to become a committer on Apache
> > Arrow. Welcome, and thank you for your contributions!
> >
> > --
> > kou
>


Re: [ANNOUNCE] New Arrow committer: Curt Hagenlocher

2023-10-16 Thread Kevin Gurney
Congratulations, Curt!

From: Weston Pace 
Sent: Sunday, October 15, 2023 5:32 PM
To: dev@arrow.apache.org 
Subject: Re: [ANNOUNCE] New Arrow committer: Curt Hagenlocher

Congratulations!

On Sun, Oct 15, 2023, 8:51 AM Gang Wu  wrote:

> Congrats!
>
> On Sun, Oct 15, 2023 at 10:49 PM David Li  wrote:
>
> > Congrats & welcome Curt!
> >
> > On Sun, Oct 15, 2023, at 09:03, wish maple wrote:
> > > Congratulations!
> > >
> > > On Sun, Oct 15, 2023 at 20:48, Raúl Cumplido wrote:
> > >
> > >> Congratulations and welcome!
> > >>
> > >> On Sun, Oct 15, 2023, 13:57, Ian Cook wrote:
> > >>
> > >> > Congratulations Curt!
> > >> >
> > >> > On Sun, Oct 15, 2023 at 05:32 Andrew Lamb wrote:
> > >> >
> > >> > > On behalf of the Arrow PMC, I'm happy to announce that Curt Hagenlocher
> > >> > > has accepted an invitation to become a committer on Apache
> > >> > > Arrow. Welcome, and thank you for your contributions!
> > >> > >
> > >> > > Andrew
> > >> > >
> > >> >
> > >>
> >
>


Re: [ANNOUNCE] New Arrow PMC member: Jonathan Keane

2023-10-16 Thread Kevin Gurney
Congratulations, Jonathan!


From: Dane Pitkin 
Sent: Monday, October 16, 2023 11:52 AM
To: dev@arrow.apache.org 
Subject: Re: [ANNOUNCE] New Arrow PMC member: Jonathan Keane

Congrats Jon!!

On Mon, Oct 16, 2023 at 7:04 AM Krisztián Szűcs wrote:

> Congrats Jon!
>
> On Mon, Oct 16, 2023 at 11:20 AM Alenka Frim wrote:
> >
> > Yay, congratulations Jon!!
> >
> > On Mon, Oct 16, 2023 at 10:27 AM vin jake  wrote:
> >
> > > Congrats Jon!
> > >
> > > On Sun, Oct 15, 2023 at 1:25 AM Andrew Lamb wrote:
> > >
> > > > The Project Management Committee (PMC) for Apache Arrow has invited
> > > > Jonathan Keane to become a PMC member and we are pleased to announce
> > > > that Jonathan Keane has accepted.
> > > >
> > > > Congratulations and welcome!
> > > >
> > > > Andrew
> > > >
> > >
>


Re: [CROWDSOURCING] 2023 ASF Board Report -- October 11, 2023

2023-10-05 Thread Kevin Gurney
Hi All,

@Andrew - thanks for organizing this!
@Kou - thank you for adding notes about the MATLAB bindings!

Sarah (Cc'd) and I added a few more details about progress on the MATLAB 
bindings.

Best Regards,

Kevin Gurney


From: Sutou Kouhei 
Sent: Wednesday, October 4, 2023 5:49 PM
To: dev@arrow.apache.org
Subject: Re: [CROWDSOURCING] 2023 ASF Board Report -- October 11, 2023

Hi Andrew,

Thanks for preparing this! I've added some comments for
Flight RPC, Apache Arrow Flight SQL adapter for PostgreSQL,
C GLib, MATLAB, Ruby, Swift and so on.


Thanks,
--
kou

In "[CROWDSOURCING] 2023 ASF Board Report -- October 11, 2023" on Thu, 28 Sep 2023 06:33:22 -0400,
Andrew Lamb wrote:

> Hello Arrow Community,
>
> Please add any comments or board content directly to [1] or reply to
> this email and I will incorporate your comments. You can see what we
> currently have at the end of this email.
>
> One of the responsibilities of being part of the Apache Software Foundation
> (ASF) is to regularly summarize the state of the project in a quarterly
> update to the ASF board. I plan to submit the next report on October 14,
> 2023.
>
> While this is partly an administrative reporting exercise, I think it is
> also valuable to reflect on past accomplishments and think about goals for
> the future.
>
> Historically, Arrow has crowdsourced the content, which has worked well.
> It would be especially interesting and valuable if members of the various
> language implementation communities and subprojects could provide a
> sentence or two of updates.
>
> Thank you,
> Andrew
>
> [1]: https://docs.google.com/document/d/1MU5cxzVuAIuDb6KXOAkwT4ze7IBGHKks_l92gxZeTbg/edit
>
> --
>
> Current content
>
> ## Description:
> The mission of Apache Arrow is the creation and maintenance of software
> related to columnar in-memory processing and data interchange. More
> information can be found at https://arrow.apache.org/overview/
>
> ## Project Status:
>
> Current project status: Ongoing (high activity)
>
> Issues for the board: None
>
> ## Membership Data:
>
> Apache Arrow was founded 2016-01-19 (8 years ago)
> There are currently 98 committers and 50 PMC members in this project.
> The Committer-to-PMC ratio is roughly 7:4.
>
> Community changes, past quarter:
> - No new PMC members. Last addition was Dewey Dunnington on 2023-06-22.
> - Kevin Gurney was added as committer on 2023-07-04
> - Metehan Yildirim was added as committer on 2023-08-29
>
>
> ## Project Activity:
>
> We added a new array layout, [Utf8View] to the Arrow spec, which allows
> more efficient variable length string handling.
>
>
>
> [Utf8View]: https://lists.apache.org/thread/wt9j3q7qd59cz44kyh1zkts8s6wo1dn6
>
>
> ## Sub Project Updates
> Arrow has several subprojects, as listed on https://arrow.apache.org/
>
> ### ADBC
>
>
>
>
> ### Arrow Flight
>
>
>
>
> ### Arrow Flight SQL
>
>
>
>
> ### DataFusion
>
>
>
> ### Acero
>
>
>
> ## Language Area Updates
>
> Arrow has at least 12 different language implementations, as explained in
> https://arrow.apache.org/overview/
>
> Arrow 12.0.0 was released from the monorepo:
> https://arrow.apache.org/blog/2023/05/02/12.0.0-release/
>
> ### C++
>
>
>
> ### C#
>
>
>
> ### Go
>
>
>
> ### Java
>
>
> ### JavaScript
>
> ### Julia
>
>
>
> ### nanoarrow
>
>
>
> ### Rust
>
> The Rust implementation is in the process of adding StringView.
>
> ### C (GLib)
>
>
> ### MATLAB
>
>
>
> ### Python
>
>
>
> ### R
>
>
>
> ### Ruby
>
>
> ### Swift
>
>
> ## Community Health:
> Community communication continues to be strong.
>
> There have been 5 blog posts published to https://arrow.apache.org/blog/ in
> the last 3 months.
>
> The mailing lists are active
>
> * dev@arrow.apache.org had a 29% decrease in traffic in the past quarter
> (568 emails compared to 789)
> * u...@arrow.apache.org had a 52% decrease in traffic in the past quarter
> (65 emails compared to 133)
>
>
> For the mono repo:
>
> * 2328 commits in the past quarter (2% increase)
> * 246 code contributors in the past quarter (-3% change)
> * 1802 PRs opened on GitHub, past quarter (-13% change)
> * 1740 PRs closed on GitHub, past quarter (-14% change)
> * 1515 issues opened on GitHub, past quarter (-10% change)
> * 1234 issues closed on GitHub, past quarter (-12% change)




Re: [ACCOUNCE] New Arrow Committer: Metehan Yildirim

2023-09-06 Thread Kevin Gurney
Congratulations Metehan!

From: Mustafa Akur 
Sent: Wednesday, September 6, 2023 1:56 AM
To: dev@arrow.apache.org 
Subject: Re: [ACCOUNCE] New Arrow Committer: Metehan Yildirim

Congrats Mete!

On Wed, Sep 6, 2023 at 7:19 AM Alenka Frim wrote:

> Congratulations Metehan!
>
> On Wed, Sep 6, 2023 at 2:21 AM Ian Joiner  wrote:
>
> > Congratulations!
> >
> > On Tue, Sep 5, 2023 at 12:14 PM Andrew Lamb wrote:
> >
> > > Belatedly,
> > >
> > > On behalf of the Arrow PMC, I'm happy to announce that Metehan Yildirim
> > > (mete [1]) has accepted an invitation to become a committer on Apache
> > > Arrow. Welcome, and thank you for your contributions!
> > >
> > > Andrew
> > >
> > > [1]: https://people.apache.org/phonebook.html?uid=mete
> > >
> >
>


Re: [MATLAB] Using GitHub Projects for Project Planning

2023-08-22 Thread Kevin Gurney
Hi All,

@Kou - thank you very much for creating the GitHub Project!

@Jin and @Rok - you make an excellent point about the namespacing. Also, it's
nice to hear that ADBC used GitHub Projects in the past.

I've updated the project name to "Arrow MATLAB".

If anyone has any other flags or suggestions, please let us know. In the 
meantime, Sarah and I will start using this for planning related to the MATLAB 
Interface.

Thank you!

Best Regards,

Kevin Gurney


From: Rok Mihevc 
Sent: Tuesday, August 22, 2023 8:12 AM
To: dev@arrow.apache.org 
Subject: Re: [MATLAB] Using GitHub Projects for Project Planning

To Jin's point - namespacing like "Arrow MATLAB" would prevent confusion.
We have prior art of "Arrow ADBC Initial Release" [1].

Rok

[1] https://github.com/orgs/apache/projects/159

On Tue, Aug 22, 2023 at 1:31 PM Jin Shang  wrote:

> Hi,
>
> I notice that this project can be seen directly from Apache's GitHub
> page [1], with no indication of Arrow. It seems like the GitHub Project is
> organization-level rather than repo-level. I fear the naming may cause confusion
> for people from other Apache projects.
>
> [1] https://github.com/orgs/apache/projects
>
> Best,
> Jin
>
> On Tue, Aug 22, 2023 at 7:22 PM Sutou Kouhei  wrote:
>
> > Hi,
> >
> > I've created a "MATLAB" project:
> > https://github.com/orgs/apache/projects/289
> > All Arrow committers have "Admin" role.
> >
> > Could you try it?
> >
> > Thanks,
> > --
> > kou
> >
> > In <mn2pr05mb6496ba6e36c6c9e08db2f7d2ae...@mn2pr05mb6496.namprd05.prod.outlook.com>
> > "[MATLAB] Using GitHub Projects for Project Planning" on Mon, 21 Aug 2023 19:28:15 +,
> > Kevin Gurney wrote:
> >
> > > Hi All,
> > >
> > > Sarah (Cc'd) and I have recently been thinking about ways to streamline
> > > the way we organize and prioritize GitHub issues related to the MATLAB
> > > interface.
> > >
> > > We're interested in trying GitHub Projects [1] to improve this process.
> > >
> > > Does anyone have any flags with creating a "MATLAB" Project under the
> > > Projects area of the upstream apache/arrow GitHub repository [2]?
> > >
> > > We appreciate any advice the community has to share on this.
> > >
> > > Thank you!
> > >
> > > [1] https://docs.github.com/en/issues/planning-and-tracking-with-projects
> > > [2] https://github.com/apache/arrow/projects
> > >
> > > Best Regards,
> > >
> > > Kevin Gurney
> >
>


[MATLAB] Using GitHub Projects for Project Planning

2023-08-21 Thread Kevin Gurney
Hi All,

Sarah (Cc'd) and I have recently been thinking about ways to streamline the way 
we organize and prioritize GitHub issues related to the MATLAB interface.

We're interested in trying GitHub Projects [1] to improve this process.

Does anyone have any flags with creating a "MATLAB" Project under the Projects 
area of the upstream apache/arrow GitHub repository [2]?

We appreciate any advice the community has to share on this.

Thank you!

[1] https://docs.github.com/en/issues/planning-and-tracking-with-projects
[2] https://github.com/apache/arrow/projects

Best Regards,

Kevin Gurney


Re: [CROWDSOURCING] Board Report -- 2 DAYS -- Please provide feedback

2023-07-13 Thread Kevin Gurney
Hi All,

Thanks for putting this together, Andrew!

Sarah, Fiona, and I added some notes about the MATLAB interface.

Best Regards,

Kevin Gurney

From: Sutou Kouhei 
Sent: Wednesday, July 12, 2023 9:36 PM
To: dev@arrow.apache.org 
Subject: Re: [CROWDSOURCING] Board Report -- 2 DAYS -- Please provide feedback

Hi,

Thanks! I've added something.

--
kou

In "Re: [CROWDSOURCING] Board Report -- 2 DAYS -- Please provide feedback" on Wed, 12 Jul 2023 16:32:23 -0400,
Andrew Lamb wrote:

> I apologize, I sent out the link for the last board report.
>
> The correct link is [1].
>
> [1] https://docs.google.com/document/d/1-VRSKq6xeBdg8uvZPLk-aMW8XzuwI4-AEnUSK1vssDQ/edit#heading=h.gv1c2bcucuam
>
> On Wed, Jul 12, 2023 at 5:49 AM Andrew Lamb  wrote:
>
>> Hello Arrow Community,
>>
>> TLDR: Please add any comments or board content directly to [2] or reply to
>> this email and I will incorporate your comments. You can see what we
>> currently have at the end of this email.
>>
>> In an epic scheduling fail, I forgot to organize this report a few weeks
>> ago, so now the deadline is tight.
>>
>> One of the responsibilities of being part of the Apache Software Foundation
>> (ASF) is to regularly summarize the state of the project in a quarterly
>> update to the ASF board. I plan to submit the next report on July 14, 2023
>> (in 2 days time -- I am sorry for the late notice)
>>
>> Historically[1], Arrow has crowd sourced the content which has worked
>> well. While this is partly an administrative reporting exercise, I think it
>> is also valuable to reflect on the past and think about goals for the
>> future.
>>
>> It would be especially interesting if anyone from the various language
>> implementation communities could provide an update of a sentence or two.
>>
>> Andrew
>>
>> [1]: 
>> https://lists.apache.org/thread/xg7pgj4stt4l2sblyt81y9s6h0cl8hw5
>>
>> [2]:
>>
>> https://docs.google.com/document/d/13FSDydEVXT2UUFdy4XKjVKNJW-WR8ylvG3aI6lD-dNI/edit#
>>
>>
>>
>> ## Description:
>> The mission of Apache Arrow is the creation and maintenance of software
>> related
>> to columnar in-memory processing and data interchange. More information
>> can be found at 
>> https://arrow.apache.org/overview/
>>
>> ## Issues:
>>
>>
>> ## Membership Data:
>> Apache Arrow was founded 2016-01-19 (7 years ago)
>> There are currently 97 committers and 50 PMC members in this project.
>> The Committer-to-PMC ratio is roughly 7:4.
>>
>> Community changes, past quarter:
>> - Ben Baumgold was added to the PMC on 2023-06-19
>> - Jie Wen was added to the PMC on 2023-06-10
>> - Dewey Dunnington was added to the PMC on 2023-06-22
>> - Matthew Topol was added to the PMC on 2023-05-02
>> - Gang Wu was added as committer on 2023-05-15
>> - Kevin Gurney was added as committer on 2023-07-04
>> - Marco Neumann was added as committer on 2023-05-11
>> - Mehmet Ozan Kabak was added as committer on 2023-06-10
>> - Ruihang Xia was added as committer on 2023-04-15
>>
>>
>>
>> ## Project Activity:
>>
>> There has been healthy debate about adding new formats, [StringArray] and
>> [ListView], focused on increasing Arrow’s appeal in high performance
>> computation engines.
>> We have completed the transition from JIRA to using GitHub issues for the
>> mono repo and that appears to be going well.
>>
>> The DataFusion subproject is considering applying to become its own top
>> level Apache project (see DataFusion update below)
>> [StringArray]:
>> https://lists.apache.org/thread/c6frlr9gcxy8qdhbmv8cn3rdjbrqxb1v
>> [ListView]:
>> https://lists.apache.org/thread/r28rw5n39jwtvn08oljl09d4q2c1ysvb
>>
>>
>>
>> ## Community Health:
>>
>>
>> There have been 9 blog posts published to 
>> https://arrow.apache.org/blog/
>> in the last 3 months, including two from community members on their use of
>> Arrow
>>
>>
>> ## Sub Project Updates
>> Arrow has several subprojects, as listed on 
>>

Re: [ANNOUNCE] New Arrow committer: Kevin Gurney

2023-07-05 Thread Kevin Gurney
Thank you all for the kind words and warm welcome!

I feel honored and excited to be part of this vibrant community! I look forward 
to continuing to collaborate with all of you!

Best Regards,

Kevin Gurney

From: Alenka Frim 
Sent: Wednesday, July 5, 2023 12:22 AM
To: dev@arrow.apache.org 
Subject: Re: [ANNOUNCE] New Arrow committer: Kevin Gurney

Congratulations!

On Tue, Jul 4, 2023 at 9:41 PM Dewey Dunnington
 wrote:

> Congrats!
>
> On Tue, Jul 4, 2023 at 2:08 PM Matt Topol  wrote:
> >
> > Welcome!
> >
> > On Tue, Jul 4, 2023, 11:06 AM Joris Van den Bossche <
> > jorisvandenboss...@gmail.com> wrote:
> >
> > > Congrats Kevin!
> > >
> > > On Tue, 4 Jul 2023 at 13:47, David Li  wrote:
> > > >
> > > > Welcome Kevin!
> > > >
> > > > On Tue, Jul 4, 2023, at 05:55, Raúl Cumplido wrote:
> > > > > Congratulations Kevin!!!
> > > > >
> > > > > On Tue, 4 Jul 2023 at 3:32, Weston Pace wrote:
> > > > >>
> > > > >> Congratulations Kevin!
> > > > >>
> > > > >> On Mon, Jul 3, 2023 at 5:18 PM Sutou Kouhei 
> > > wrote:
> > > > >>
> > > > >> > On behalf of the Arrow PMC, I'm happy to announce that Kevin
> Gurney
> > > > >> > has accepted an invitation to become a committer on Apache
> > > > >> > Arrow. Welcome, and thank you for your contributions!
> > > > >> >
> > > > >> > --
> > > > >> > kou
> > > > >> >
> > >
>


Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ objects using MEX

2023-03-15 Thread Kevin Gurney
Hi All,

Following up on this mailing list discussion.

After several months of work, we now have a working version of libmexclass [1] 
to support development of the MATLAB Interface to Arrow.

We've opened a pull request [2] for integrating libmexclass with the Arrow 
codebase.

Making this a reality has involved a considerable amount of work by a number of 
MathWorkers including Fiona La, Sreehari Hegden, Sarah Gilmore, Jeremy Hughes, 
and many others. Thanks to everyone for their help in making this possible!

Note that libmexclass is still a relatively new project and continues to be 
tweaked and refined as we encounter new use cases.

Please let us know if you have any questions.

[1] https://github.com/mathworks/libmexclass
[2] https://github.com/apache/arrow/pull/34563

Best Regards,

Kevin Gurney

From: Kevin Gurney 
Sent: Friday, July 15, 2022 10:38 AM
To: Sutou Kouhei ; dev@arrow.apache.org 

Cc: Fiona La ; Jeremy Hughes ; 
Nick Haddad 
Subject: Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ 
objects using MEX

Hi Kou,

Thank you for confirming the license compatibility!

We will continue to keep the community updated as development efforts progress 
on libmexclass, and we prepare for integration with the upstream Arrow codebase.

Best Regards,

Kevin Gurney


From: Sutou Kouhei 
Sent: Wednesday, July 13, 2022 9:33 PM
To: dev@arrow.apache.org 
Cc: Kevin Gurney ; Fiona La ; 
Jeremy Hughes ; Nick Haddad 
Subject: Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ 
objects using MEX

Hi,

Thanks for sharing progress!

There is no problem with the BSD 3-Clause license because
it's compatible with Apache License 2.0 and listed in
Category A:
https://www.apache.org/legal/resolved.html#category-a

> Category A: Licenses in Category A may be included in
> Apache Software Foundation products. They are said to be
> "Apache-like".


Thanks,
--
kou

In 

"Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ objects 
using MEX" on Wed, 13 Jul 2022 19:20:40 +,
Kevin Gurney  wrote:

> Hi All,
>
> I am following up to close the loop on this. Apologies for the delay. We had 
> to work through some technical and procedural issues before releasing the 
> code.
>
> Updates:
>
> 1. We decided to release the code under the BSD 3-Clause [1], rather than the 
> BSD 2-Clause license.
>
> If there are any concerns about this licensing change, please let us know. 
> Our understanding is that BSD 3-Clause license should still be compatible 
> with the licensing of the upstream Arrow codebase. We apologize for the 
> confusion.
>
> 2. Initial code has been released under the MathWorks GitHub organization in 
> a repository named "libmexclass" [2].
>
> We chose the name "libmexclass" because the project aims to enable users to 
> implement MATLAB classes in terms of calls to corresponding C++ classes using 
> MEX [3].
>
> The code is under active development and is not yet ready for integration 
> with the Arrow codebase. However, we wanted to get the code on GitHub as soon 
> as possible so that anyone who is interested can feel free to follow 
> development progress. We welcome any contributions from Arrow community 
> members!
>
> Once the code has matured a bit more, we will work with the Arrow community 
> to update the build infrastructure for the MATLAB Interface to Arrow to make 
> use of libmexclass. Our hope is that using libmexclass will help unblock and 
> streamline development efforts for the MATLAB interface.
>
> Thank you again to the community for providing helpful feedback and enabling 
> us to move forward.
>
> [1] 
> https://github.com/mathworks/libmexclass/blob/main/LICENSE
> [2] 
> https://github.com/mathworks/libmexclass
> [3] https://www.mathworks.com/help/matlab/call-mex-files-1.html
>
> Best Regards,
>
> Kevin Gurney
> 
> From: Sutou Kouhei 
> Sent: Sunday, June 12, 2022 10:12 PM
> To: dev@arrow.apache.org 
> Cc: Fiona La ; Jeremy Hughes ; 
> Nick Haddad ; Kevin Gurney 
> Subject: Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ 
> objects using MEX
>
> +1
>
> In 
> 
> "Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ objects 
> using MEX" on Fri, 10 Jun 2022 18:22:47 +,
> Kevin Gurney  wrote:
>
>> Hi Kou,
>>
>> Thank you for helping to clear up our confusion.
>>
>>> How do we install the object dispatch layer to use it in
>>> apache/arrow? I assumed

Re: [VOTE] Format: Fixed shape tensor Canonical Extension Type

2023-02-28 Thread Kevin Gurney
Hi Alenka,

Thank you. I've informed my colleagues at MathWorks to add any further comments 
to the PR.

My apologies for bringing this up on the voting thread.

Best Regards,

Kevin Gurney


From: Alenka Frim 
Sent: Tuesday, February 28, 2023 4:19 AM
To: dev@arrow.apache.org 
Subject: Re: [VOTE] Format: Fixed shape tensor Canonical Extension Type

This was actually already meant as the voting thread, but given it sparked
some more discussion, let's give this a few more days, and then re-start
with a new vote thread.

*So if someone still has comments on the current text, please bring those
up here or in the PR*: 
https://github.com/apache/arrow/pull/33925.

Alenka

On Fri, Feb 24, 2023 at 10:15 AM Kevin Gurney  wrote:

> Hi All,
>
> Thank you very much for creating this proposal, Alenka!
>
> I noticed the following in the notes [1] shared from the February 15th
> Arrow Community Meeting:
>
> "Members of Hugging Face, Ray, and PyTorch community have given input and
> some of it was incorporated - It would be good to have input from some
> other companies and project communities including Lance, NumPy, Posit,
> MATLAB, DLPack, CUDA/RAPIDS, Arrow Rust, Xarray, Julia, Fortran,
> TensorFlow, LinkedIn"
>
> Based on the inclusion of MATLAB in the list above, I've shared this
> proposal with some colleagues at MathWorks who have expertise in the deep
> learning area. They will respond here if they have any additional input to
> add.
>
> That being said, I recognize that this proposal is already nearing the
> voting phase.
>
> [1] 
> https://lists.apache.org/thread/bblcwwq7gl1x2hsr1qsormv9f3vr23jn
>
> Best Regards,
>
> Kevin Gurney
>
> 
> From: Rok Mihevc 
> Sent: Thursday, February 23, 2023 8:12 AM
> To: dev@arrow.apache.org 
> Subject: Re: [VOTE] Format: Fixed shape tensor Canonical Extension Type
>
> That makes sense indeed.
> Do we have any more comments on the language of the proposal [1] or should
> we proceed to vote?
>
> Rok
>
> [1] 
> https://github.com/apache/arrow/pull/33925/files
>
> On Wed, Feb 22, 2023 at 2:13 PM Antoine Pitrou  wrote:
>
> >
> > That's a good point.
> >
> > Regards
> >
> > Antoine.
> >
> >
> > On 22/02/2023 at 14:11, Dewey Dunnington wrote:
> > > I don't think having both dimension names and permutation is
> > > redundant...dimension names can also serve as human-readable tags that
> > help
> > > a human interpret the values. If reading a NetCDF, for example, one
> might
> > > store the dimension variable names. When determining type equality it
> may
> > > be useful that {..., permutation = [2, 0, 1], dim_names = ["C", "H",
> > "W"]}
> > > is not equal to {..., permutation = [2, 0, 1], dim_names = ["x", "y",
> > "z"]}.
> > >
> > > On Wed, Feb 22, 2023 at 4:56 AM Rok Mihevc 
> wrote:
> > >
> > >>>
> > >>>>>
> > >>>>> Should we rule that `dim_names` and `permutation` are mutually
> > >>> exclusive?
> > >>>>>
> > >>>>
> > >>>> Since `dim_names` have to "map to the physical layout (row-major)"
> > that
> > >>>> means permutation will always be trivial which indeed makes it
> > >>> unnecessary
> > >>>> to store both.
> > >>>
> > >>> I don't think it is necessarily needed to explicitly make them
> > >>> mutually exclusive. I don't know how useful this would in practice,
> > >>> but you certainly *can* specify both in a meaningful way. Re-using
> the
> > >>> example of NHWC data, which is physically stored as NCHW, you can
> keep
> > >>> track of this by specifying a permutation of [2, 0, 1], but at the
> > >>> same time you could also still save the dimension names as ["C", "H",
> > >>> "W"].
> > >>>
> > >>
> > >> I'll advocate for the original comment, but I'm ok either way. Having
> > both
> > >> `dim_names` and `permutation` is redundant - if the user knows their
> > >> desired order of `dim_names` they can derive the permutation. If they
> > don't
> > >> use `dim_names` they probably don't want them.
> > >>
> > >
> >
>


Re: [VOTE] Format: Fixed shape tensor Canonical Extension Type

2023-02-24 Thread Kevin Gurney
Hi All,

Thank you very much for creating this proposal, Alenka!

I noticed the following in the notes [1] shared from the February 15th Arrow 
Community Meeting:

"Members of Hugging Face, Ray, and PyTorch community have given input and some 
of it was incorporated - It would be good to have input from some other 
companies and project communities including Lance, NumPy, Posit, MATLAB, 
DLPack, CUDA/RAPIDS, Arrow Rust, Xarray, Julia, Fortran, TensorFlow, LinkedIn"

Based on the inclusion of MATLAB in the list above, I've shared this proposal 
with some colleagues at MathWorks who have expertise in the deep learning area. 
They will respond here if they have any additional input to add.

That being said, I recognize that this proposal is already nearing the voting 
phase.

[1] https://lists.apache.org/thread/bblcwwq7gl1x2hsr1qsormv9f3vr23jn

Best Regards,

Kevin Gurney


From: Rok Mihevc 
Sent: Thursday, February 23, 2023 8:12 AM
To: dev@arrow.apache.org 
Subject: Re: [VOTE] Format: Fixed shape tensor Canonical Extension Type

That makes sense indeed.
Do we have any more comments on the language of the proposal [1] or should
we proceed to vote?

Rok

[1] 
https://github.com/apache/arrow/pull/33925/files

On Wed, Feb 22, 2023 at 2:13 PM Antoine Pitrou  wrote:

>
> That's a good point.
>
> Regards
>
> Antoine.
>
>
> On 22/02/2023 at 14:11, Dewey Dunnington wrote:
> > I don't think having both dimension names and permutation is
> > redundant...dimension names can also serve as human-readable tags that
> help
> > a human interpret the values. If reading a NetCDF, for example, one might
> > store the dimension variable names. When determining type equality it may
> > be useful that {..., permutation = [2, 0, 1], dim_names = ["C", "H",
> "W"]}
> > is not equal to {..., permutation = [2, 0, 1], dim_names = ["x", "y",
> "z"]}.
> >
> > On Wed, Feb 22, 2023 at 4:56 AM Rok Mihevc  wrote:
> >
> >>>
> >>>>>
> >>>>> Should we rule that `dim_names` and `permutation` are mutually
> >>> exclusive?
> >>>>>
> >>>>
> >>>> Since `dim_names` have to "map to the physical layout (row-major)"
> that
> >>>> means permutation will always be trivial which indeed makes it
> >>> unnecessary
> >>>> to store both.
> >>>
> >>> I don't think it is necessarily needed to explicitly make them
> >>> mutually exclusive. I don't know how useful this would be in practice,
> >>> but you certainly *can* specify both in a meaningful way. Re-using the
> >>> example of NHWC data, which is physically stored as NCHW, you can keep
> >>> track of this by specifying a permutation of [2, 0, 1], but at the
> >>> same time you could also still save the dimension names as ["C", "H",
> >>> "W"].
> >>>
> >>
> >> I'll advocate for the original comment, but I'm ok either way. Having
> both
> >> `dim_names` and `permutation` is redundant - if the user knows their
> >> desired order of `dim_names` they can derive the permutation. If they
> don't
> >> use `dim_names` they probably don't want them.
> >>
> >
>
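
To make the two positions in the quoted discussion concrete, here is a minimal Python sketch. It is not the Arrow implementation: `TensorTypeParams` and `derive_permutation` are hypothetical names, the shape values are made up, and the index convention (entry i gives the logical position of physical dimension i) is an assumption chosen to match the [2, 0, 1] example above. It shows that a permutation can be derived from two name orders, and that two parameter sets with the same permutation but different `dim_names` compare as unequal.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TensorTypeParams:
    """Hypothetical stand-in for the extension type's parameters."""
    shape: Tuple[int, ...]
    permutation: Optional[Tuple[int, ...]] = None
    dim_names: Optional[Tuple[str, ...]] = None


def derive_permutation(physical_names, logical_names):
    """For each physical dimension, return its index in the logical order.

    Assumed convention for illustration only.
    """
    if sorted(physical_names) != sorted(logical_names):
        raise ValueError("both orders must contain the same dimension names")
    return tuple(logical_names.index(name) for name in physical_names)


# Physical layout is C, H, W; the desired logical order is H, W, C.
perm = derive_permutation(("C", "H", "W"), ("H", "W", "C"))

# Same shape and permutation, different dimension names.
chw = TensorTypeParams((3, 224, 224), perm, ("C", "H", "W"))
xyz = TensorTypeParams((3, 224, 224), perm, ("x", "y", "z"))
same_type = chw == xyz
```

Under these assumptions the names are derivable from the permutation only when the desired order is already known, which is why keeping both fields can still carry extra information for a reader.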


Re: Proposal: renaming the 'master' branch to 'main'

2023-02-17 Thread Kevin Gurney
Hi Joris,

This is great news! Thank you to everyone who made this possible!

It's encouraging to see steps being taken to make the Apache Arrow community 
more inclusive.

Best Regards,

Kevin Gurney

From: Joris Van den Bossche 
Sent: Friday, February 17, 2023 3:06 AM
To: dev@arrow.apache.org 
Subject: Re: Proposal: renaming the 'master' branch to 'main'

Also for https://github.com/apache/arrow the default branch is now
renamed to "main".

You will see some instructions the first time visiting the github repo
since the rename, but copying them here below. You can rename the
master branch on your fork as well (visiting
https://github.com/<https://github.com>/arrow will prompt a message 
to do this).
After that, assuming your fork is called "origin" locally, the
instructions from github to update your local clone:

git branch -m master main
git fetch origin
git branch -u origin/main main
git remote set-head origin -a
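
If the fork's remote is not called "origin", the same four commands apply with the remote name swapped in. A small Python sketch that only builds the command lines, without running git; the function name and the "upstream" remote are hypothetical, chosen for illustration:

```python
def rename_default_branch_commands(remote="origin", old="master", new="main"):
    """Build the four git commands quoted above for a given remote name."""
    return [
        ["git", "branch", "-m", old, new],                # rename the local branch
        ["git", "fetch", remote],                         # pick up the renamed remote branch
        ["git", "branch", "-u", f"{remote}/{new}", new],  # track the new remote branch
        ["git", "remote", "set-head", remote, "-a"],      # refresh the remote's default HEAD
    ]


# Example: the fork is registered locally under the name "upstream".
commands = rename_default_branch_commands(remote="upstream")
printable = [" ".join(cmd) for cmd in commands]
```

Each inner list can be passed to `subprocess.run` from inside the local clone if running the commands from a script is preferred.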

On Thu, 16 Feb 2023 at 12:59, Andy Grove  wrote:
>
> https://github.com/apache/arrow-datafusion
>  default branch is now "main".
>
> Andy.
>
> On Tue, Feb 14, 2023 at 7:17 AM Andy Grove  wrote:
>
> > I would like to rename the default branch in arrow-datafusion next. I have
> > a PR up with the required changes:
> >
> > https://github.com/apache/arrow-datafusion/pull/5280
> >
> > I will submit a request to INFRA in the next few days to rename the
> > default branch if there are no objections, and I will merge this PR once
> > that has happened.
> >
> > Thanks,
> >
> > Andy.
> >
> >
> >
> > On Mon, Jan 23, 2023 at 12:30 PM Andy Grove  wrote:
> >
> >> The default branch in 
> >> https://github.com/apache/arrow-ballista
> >>  is also
> >> now "main".
> >>
> >> I will wait until we have gone through the release process in Ballista
> >> and DataFusion Python before proposing that we make these same changes in
> >> DataFusion.
> >>
> >> On Sun, Jan 22, 2023 at 8:10 AM Andy Grove  wrote:
> >>
> >>> The default branch in 
> >>> https://github.com/apache/arrow-datafusion-python
> >>> is now main.
> >>>
> >>> The process was simple for this repo - file an issue with INFRA, create
> >>> a PR to replace master with main where appropriate (docs and workflows).
> >>>
> >>> I plan on doing the same for Ballista and DataFusion over the next week
> >>> or two. I am not very active in arrow-rs, so I will leave that one for
> >>> someone else to pick up.
> >>>
> >>> On Thu, Jan 19, 2023 at 3:11 PM Jacob Wujciak
> >>>  wrote:
> >>>
> >>>> Thanks for moving this forward! In apache/arrow we have an umbrella
> >>>> issue
> >>>> [1] and I think we are at the point where we are ready to ask INFRA to
> >>>> rename the default branch to main.
> >>>> Thanks to Fiona and Kevin for their work on the hardcoded branch names
> >>>> in
> >>>> the dev tooling that was blocking this!
> >>>>
> >>>> Unless there are any objections we can initiate the switch after the
> >>>> release.
> >>>>
> >>>> [1]: 
> >>>> https://github.com/apache/arrow/issues/31142
> >>>>
> >>>> On Thu, Jan 19, 2023 at 8:49 PM MAURICIO ANDRES VARGAS SEPULVEDA
> >>>>  wrote:
> >>>>
> >>>> > yes, please!
> >>>> >
> >>>> >
> >>>> > ———
> >>>> >
> >>>> > Mauricio Vargas Sepulveda
> >>>> >
> >>>> > Master's Representative
> >>>> >
> >>>> > UofT Political Science
> >>>> >
> >>>> > 
> >>>> > From: Andy Grove 
> >>>> > Sent: January 19, 2023 2:44 PM
> >>>> > To: dev@arrow.apache.org 
> >>>> > Subject: Re: Proposal: renaming the 'master' branch to 'main'
> >>>> >
> >>>> > I have filed issues in the Arrow Rust repos (let me know if I missed
> >>>> any)
> >>>> >
> &

Re: Adding a CODEOWNERS file

2023-01-12 Thread Kevin Gurney
Yes, thank you for taking the initiative on this, Jacob!

Best Regards,

Kevin Gurney

From: Antoine Pitrou 
Sent: Thursday, January 12, 2023 6:04 AM
To: dev@arrow.apache.org 
Subject: Re: Adding a CODEOWNERS file


This sounds like a good idea to me. Thanks for doing this!

Regards

Antoine.


On 12/01/2023 at 01:06, Jacob Wujciak wrote:
> Hello Everyone,
>
> As discussed in an issue spawned by the state of the project thread [1] I
> have created a draft PR that adds a CODEOWNERS file to apache/arrow [2].
>
> Adding a CODEOWNERS file will allow committers to be automatically
> requested for reviews that they are interested in (based on touched files),
> enabling them to basically "subscribe" to a selection of PRs based on their
> interests/competence within the monorepo without having to watch all
> notifications for the repo.
> The main advantage, in my opinion, is that it removes the burden of finding
> an (initial) reviewer for a PR for contributors, which is a major block in
> the arrow dev workflow, especially for new contributors.
>
> Note that adding a CODEOWNERS file will not automatically activate the
> branch protection rules to enforce a codeowner review on the respective
> code.
>
> Please review the PR and add yourself to the file via suggestion or direct
> push to the branch! Documentation on CODEOWNERS file and syntax: [3]
>
> Thanks,
>
> Jacob
> [1]: 
> https://github.com/apache/arrow/issues/15232
> [2]: 
> https://github.com/apache/arrow/pull/33622
> [3]:
> https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners
>
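
As a rough illustration of the "subscribe by touched files" idea described above, here is a simplified Python sketch. The rules and owner handles are hypothetical, and `fnmatchcase` is only a stand-in for the real CODEOWNERS pattern syntax, which differs in its details. The one behavior it mirrors is that the last matching rule for a path takes precedence.

```python
from fnmatch import fnmatchcase

# Hypothetical rules: (path pattern, owners). Handles are made up for illustration.
RULES = [
    ("*", ["@default-reviewer"]),
    ("matlab/*", ["@matlab-reviewer"]),
    ("docs/*", ["@docs-reviewer"]),
]


def owners_for(path, rules=RULES):
    """Return the owners of the last rule whose pattern matches the path."""
    owners = []
    for pattern, rule_owners in rules:
        if fnmatchcase(path, pattern):
            owners = rule_owners  # later rules override earlier ones
    return owners


def reviewers_for(changed_paths, rules=RULES):
    """Collect the owners to request for a set of touched files."""
    requested = set()
    for path in changed_paths:
        requested.update(owners_for(path, rules))
    return sorted(requested)


# A pull request touching one MATLAB file and the top-level README.
requested = reviewers_for(["matlab/src/array.m", "README.md"])
```

In this sketch a contributor never has to pick a reviewer by hand: the set of touched paths determines who is asked.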


Re: [DISCUSS] State of the Arrow Project 2022

2023-01-06 Thread Kevin Gurney
Thank you for starting this discussion, Andrew!

Fiona, Sreehari, and I thought a bit about this, and I've summarized some of 
our thoughts below.

Continue:

1. +1 to Will's suggestion about roadmaps for sub-projects. This is something 
that would be helpful for the MATLAB interface, for example. We would also be 
interested in the possibility of exploring a MATLAB sync call if it would be of 
interest to other community members.

2. Continue focusing on building an inclusive developer community. Finish the 
work required to rename the master branch to main. Consider running automated 
checks on pull requests using a tool like alex [1] to prevent use of 
inappropriate language and terminology.

Start:

1. Add more visuals and diagrams to the documentation. It can be pretty 
overwhelming for new community members to look at the in-depth Arrow C++ 
documentation and be able to quickly get a high-level understanding of how the 
various data structures (e.g. buffer, array, chunked array, record batch, 
table, field, schema, data type, etc.) relate to one another. Having more 
visuals with clear labels that show the relationship between these key concepts 
would be very helpful. This also applies to other parts of the documentation, 
like the CI systems (e.g. crossbow), which have a lot of moving parts.

2. Use pull request templates. This would hopefully make it easier for both new 
and existing contributors to describe their changes in a focused and clear way 
to others. For example, when making pull requests related to the MATLAB 
interface, we've been trying to follow a fairly consistent pattern for pull 
request descriptions which includes sections like "Overview", "Implementation", 
"Testing", "Future Directions", "Notes", etc.

Stop:

1. +1 to Andrew's point about the reliance on a small number of core 
contributors for code reviews. Documenting a process for determining who should 
be included on a code review would be helpful.

[1] https://github.com/get-alex/alex


From: Dewey Dunnington 
Sent: Tuesday, January 3, 2023 2:33 PM
To: dev@arrow.apache.org 
Subject: Re: [DISCUSS] State of the Arrow Project 2022

First, a +1000 on Will's blog post! [1]

Continue:

Building tools that benefit users of all languages, with particular kudos
to ADBC for providing an ABI-stable way to write database drivers that can
be used by practitioners in C++, Ruby, Python, Java, Go, and (soon!) R.

Start:

I wonder if this is the year that we can find a way to write compute
functions in such a way that separate implementations don't have to exist
for C++, Go, and Rust (and maybe others I don't know about).

Stop:

Will's comment that we should stop building data scientist-facing tools
under the Arrow name struck a particular chord with me...the R package is
very much data scientist facing and we have a rather large disjoint between
the technical capacity of our users and the technical capacity required to
contribute to the package (e.g., maintaining a development Arrow C++
install). The types of things we have to do to make RecordBatchReader,
Arrays, Buffer, RecordBatch and Table structures available to R users and
the types of things we have to do to provide an Acero dplyr backend are
vastly different.

[1] 
https://www.datawill.io/posts/apache-arrow-2022-reflection/

On Thu, Dec 29, 2022 at 4:09 PM Jacob Wujciak 
wrote:

> This is a great idea, I will add some thoughts later but just wanted to
> quickly add that the Zulip Chat [1] was recently switched to allow anyone
> to register without the need for an invite link!
> [1]: https://ursalabs.zulipchat.com/
>
>
> On Wed, Dec 28, 2022 at 11:27 PM Will Jones 
> wrote:
>
> > Thanks for suggesting this Andrew.
> >
> > I just uploaded a blog post with my thoughts in long form [1]. Here are
> > some suggestions pulled from that:
> >
> > Continue:
> >
> > I hope we will continue prioritizing updating the spec for new array
> > formats. [2] I think this is very important for avoiding fragmentation
> and
> > may even open opportunities for consolidation in the C++ ecosystem.
> >
> > +1 on additional improvements for documentation, examples, no-invite
> chats.
> > I am particularly keen on seeing evangelism for our protocols; existing
> > ones like C Data Interface aren't nearly as widely known as they ought to
> > be and I'm excited for new ones like ADBC.
> >
> > Start:
> >
> > Find ways for each subproject to publicly develop a clear roadmap.
> > Otherwise by default these discussions happen in private, either between
> > individual ICs or within corporate environments. Some subprojects, such
> as
> > Acero could likely use their own sync call to help facilitate this, even
> if
> > on a slower cadence than the main biweekly call.
> >
> > Also, other sync calls might consider adapting to the sync call note
> style
> > used in the Rust projects, where all 

Re: Apache Arrow Board Report, by Jan 11 2023

2023-01-06 Thread Kevin Gurney
Sreehari, Fiona, and I added a few notes about progress on the MATLAB interface.

Best Regards,

Kevin Gurney

From: Andrew Lamb 
Sent: Wednesday, January 4, 2023 7:24 PM
To: u...@arrow.apache.org ; dev 
Subject: Re: Apache Arrow Board Report, by Jan 11 2023

Thank you Jacob and Matthew -- the level of detail in your suggestions
looks just about perfect.

On Wed, Jan 4, 2023 at 12:20 PM Jacob Quinn  wrote:

> I added a few notes on the Julia implementation.
>
> -Jacob
>
> On Tue, Dec 27, 2022 at 2:45 PM Andrew Lamb  wrote:
>
>> Hello Arrow Community,
>>
>> One of the (possibly the only) responsibilities of the PMC chair is to
>> collect information on the project and submit quarterly updates to the ASF
>> board. The next one is due on January 11, 2023
>>
>> Historically[1], Arrow has crowd sourced the content and I plan to
>> continue the tradition.
>>
>> Please feel free to add your comments directly to [2] or reply to this
>> email and I will incorporate your comments.
>>
>> I think it would be especially interesting if anyone from the following
>> implementation communities wanted to provide any updates:
>>
>> ### C++
>> ### C#
>> ### Go
>> ### Java
>> ### JavaScript
>> ### Julia
>> ### Rust
>> ### C (Glib)
>> ### MATLAB
>> ### Python
>> ### R
>> ### Ruby
>>
>> Thank you,
>> Andrew
>>
>>
>> [1] 
>> https://lists.apache.org/thread/w7lwr7t979oqsqb8qz4smtg9wmj9f48s
>> [2]
>> https://docs.google.com/document/d/12ybofzyB8FGlsWV6IxefAUqRDgxFR-Kk53kPk7LuKDY/edit?usp=sharing
>>
>>


Re: Proposal: renaming the 'master' branch to 'main'

2022-07-26 Thread Kevin Gurney
Hi All,

Thank you for bringing up this topic again.

@Neal - I am already the assignee for ARROW-15693, which involves updating 
crossbow. I will work on completing this as soon as possible.

Best Regards,

Kevin Gurney


From: Remzi Yang <1371656737...@gmail.com>
Sent: Monday, July 25, 2022 11:32 PM
To: dev@arrow.apache.org 
Subject: Re: Proposal: renaming the 'master' branch to 'main'

Should we also do this change in arrow-rs?

Remzi

On Tue, 26 Jul 2022 at 11:25, Neal Richardson 
wrote:

> Many of the subtasks on 
> https://issues.apache.org/jira/browse/ARROW-15689
> have already been done. What's left is to update archery and crossbow, then
> we can ask Infra to make the switch. Is anyone able to take those subtasks
> on?
>
> Neal
>
> On Mon, Jul 25, 2022 at 4:58 PM Matthew Topol wrote:
>
> > I'm in favor of it, for what it's worth.
> >
> > --Matt
> >
> > On Mon, Jul 25 2022 at 02:56:31 PM -0600, Wes McKinney
> >  wrote:
> > > hi all,
> > >
> > > Do you think we could make a push to make this happen after the 9.0.0
> > > release goes out?
> > >
> > > Thanks
> > > Wes
> > >
> > > On Tue, Feb 15, 2022 at 2:32 PM Fiona La wrote:
> > >>
> > >> Thank you Antoine for bringing up the engineering work that is
> > >> required to enable this. And thank you Neal for sharing the link to
> > >> the previous discussion and creating the umbrella issue/breaking
> > >> down the tasks in Jira.
> > >>
> > >> I am happy to work on these subtasks to move the effort forward;
> > >> I’ll start with this subtask [1].
> > >>
> > >> [1]: 
> > >> <https://issues.apache.org/jira/browse/ARROW-15692>
> > >>
> > >>
> > >> From: Neal Richardson
> > >> Date: Tuesday, February 15, 2022 at 3:38 PM
> > >> To: dev <dev@arrow.apache.org>
> > >> Subject: Re: Proposal: renaming the 'master' branch to 'main'
> > >> Good point Antoine, there is engineering work to be done to allow
> > >> this.
> > >> I've made an umbrella issue [1] and broke out a bunch of subtasks.
> > >> The idea
> > >> is to update our code (mostly CI stuff) to use the default branch,
> > >> whatever
> > >> it happens to be called, and then after that is done we will be
> > >> able to ask
> > >> INFRA to make the change. Those who are interested in seeing this
> > >> happen
> > >> can work on those subtasks, and we can check back once those are
> > >> done.
> > >>
> > >> Neal
> > >>
> > >> [1]:
> > >> <https://issues.apache.org/jira/browse/ARROW-15689>
> > >>
> > >> On Mon, Feb 14, 2022 at 3:46 PM Antoine Pitrou wrote:
> > >>
> > >> >
> > >> > On 14/02/2022 at 21:45, Neal Richardson wrote:
> > >> > > There was discussion of this back in 2020 [1], and the consensus at the
> > >> > > time seemed to be to wait and see where git and GitHub would land before
> > >> > > making what could be a disruptive change. I support reopening the
> > >> > > discussion.
> > >> > >
> > >> > > It looks like quite a few ASF projects have switched to 'main' by now [2],
> > >> > > and in order to do so, we'd need a vote/consensus thread here and then make
> > >> > > an INFRA issue (like [3]).
> > >> >
> > >> > We also need someone to take responsibility for ensuring that all CI
> > >> > jobs and other automation are updated for the change. Presumably that
> > >> > could be the proponent.
> > >> >
> > >> > Regards
> > >> >
> > >> > Antoine.
> > >> >
> > >> >

Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ objects using MEX

2022-07-15 Thread Kevin Gurney
Hi Kou,

Thank you for confirming the license compatibility!

We will continue to keep the community updated as development efforts progress 
on libmexclass, and we prepare for integration with the upstream Arrow codebase.

Best Regards,

Kevin Gurney


From: Sutou Kouhei 
Sent: Wednesday, July 13, 2022 9:33 PM
To: dev@arrow.apache.org 
Cc: Kevin Gurney ; Fiona La ; 
Jeremy Hughes ; Nick Haddad 
Subject: Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ 
objects using MEX

Hi,

Thanks for sharing progress!

There is no problem with the BSD 3-Clause license because
it's compatible with Apache License 2.0 and listed in
Category A:
https://www.apache.org/legal/resolved.html#category-a

> Category A: Licenses in Category A may be included in
> Apache Software Foundation products. They are said to be
> "Apache-like".


Thanks,
--
kou

In 

"Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ objects 
using MEX" on Wed, 13 Jul 2022 19:20:40 +,
Kevin Gurney  wrote:

> Hi All,
>
> I am following up to close the loop on this. Apologies for the delay. We had 
> to work through some technical and procedural issues before releasing the 
> code.
>
> Updates:
>
> 1. We decided to release the code under the BSD 3-Clause [1], rather than the 
> BSD 2-Clause license.
>
> If there are any concerns about this licensing change, please let us know. 
> Our understanding is that the BSD 3-Clause license should still be compatible 
> with the licensing of the upstream Arrow codebase. We apologize for the 
> confusion.
>
> 2. Initial code has been released under the MathWorks GitHub organization in 
> a repository named "libmexclass" [2].
>
> We chose the name "libmexclass" because the project aims to enable users to 
> implement MATLAB classes in terms of calls to corresponding C++ classes using 
> MEX [3].
>
> The code is under active development and is not yet ready for integration 
> with the Arrow codebase. However, we wanted to get the code on GitHub as soon 
> as possible so that anyone who is interested can feel free to follow 
> development progress. We welcome any contributions from Arrow community 
> members!
>
> Once the code has matured a bit more, we will work with the Arrow community 
> to update the build infrastructure for the MATLAB Interface to Arrow to make 
> use of libmexclass. Our hope is that using libmexclass will help unblock and 
> streamline development efforts for the MATLAB interface.
>
> Thank you again to the community for providing helpful feedback and enabling 
> us to move forward.
>
> [1] https://github.com/mathworks/libmexclass/blob/main/LICENSE
> [2] https://github.com/mathworks/libmexclass
> [3] https://www.mathworks.com/help/matlab/call-mex-files-1.html
>
> Best Regards,
>
> Kevin Gurney
> 
> From: Sutou Kouhei 
> Sent: Sunday, June 12, 2022 10:12 PM
> To: dev@arrow.apache.org 
> Cc: Fiona La ; Jeremy Hughes ; 
> Nick Haddad ; Kevin Gurney 
> Subject: Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ 
> objects using MEX
>
> +1
>
> In 
> 
> "Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ objects 
> using MEX" on Fri, 10 Jun 2022 18:22:47 +,
> Kevin Gurney  wrote:
>
>> Hi Kou,
>>
>> Thank you for helping to clear up our confusion.
>>
>>> How do we install the object dispatch layer to use it in
>>> apache/arrow? I assumed that something like the following:
>>>
>>> 
>>> $ git clone https://github.com/mathworks/object-dispatch-layer.git
>>> $ cd object-dispatch-layer
>>> $ cmake -S . -B build ...
>>> $ cmake --build build
>>> $ cmake --install build
>>> $ git clone https://github.com/apache/arrow.git
>>> $ cd arrow/matlab
>>> $ cmake -S . -B build # This finds the installed object-dispatch-layer
>>> $ cmake --build build
>>> $ cmake --install build
>>> 
>>>
>>> Is my assumption right?
>>
>> Your understanding is correct. Thanks for checking.
>>
>>> BTW, why do you want to use "git submodule"

Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ objects using MEX

2022-07-13 Thread Kevin Gurney
Hi All,

I am following up to close the loop on this. Apologies for the delay. We had to 
work through some technical and procedural issues before releasing the code.

Updates:

1. We decided to release the code under the BSD 3-Clause [1], rather than the 
BSD 2-Clause license.

If there are any concerns about this licensing change, please let us know. Our 
understanding is that the BSD 3-Clause license should still be compatible with the 
licensing of the upstream Arrow codebase. We apologize for the confusion.

2. Initial code has been released under the MathWorks GitHub organization in a 
repository named "libmexclass" [2].

We chose the name "libmexclass" because the project aims to enable users to 
implement MATLAB classes in terms of calls to corresponding C++ classes using 
MEX [3].

The code is under active development and is not yet ready for integration with 
the Arrow codebase. However, we wanted to get the code on GitHub as soon as 
possible so that anyone who is interested can feel free to follow development 
progress. We welcome any contributions from Arrow community members!

Once the code has matured a bit more, we will work with the Arrow community to 
update the build infrastructure for the MATLAB Interface to Arrow to make use 
of libmexclass. Our hope is that using libmexclass will help unblock and 
streamline development efforts for the MATLAB interface.

Thank you again to the community for providing helpful feedback and enabling us 
to move forward.

[1] https://github.com/mathworks/libmexclass/blob/main/LICENSE
[2] https://github.com/mathworks/libmexclass
[3] https://www.mathworks.com/help/matlab/call-mex-files-1.html

Best Regards,

Kevin Gurney

From: Sutou Kouhei 
Sent: Sunday, June 12, 2022 10:12 PM
To: dev@arrow.apache.org 
Cc: Fiona La ; Jeremy Hughes ; 
Nick Haddad ; Kevin Gurney 
Subject: Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ 
objects using MEX

+1

In 

"Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ objects 
using MEX" on Fri, 10 Jun 2022 18:22:47 +,
Kevin Gurney  wrote:

> Hi Kou,
>
> Thank you for helping to clear up our confusion.
>
>> How do we install the object dispatch layer to use it in
>> apache/arrow? I assumed that something like the following:
>>
>> 
>> $ git clone https://github.com/mathworks/object-dispatch-layer.git
>> $ cd object-dispatch-layer
>> $ cmake -S . -B build ...
>> $ cmake --build build
>> $ cmake --install build
>> $ git clone https://github.com/apache/arrow.git
>> $ cd arrow/matlab
>> $ cmake -S . -B build # This finds the installed object-dispatch-layer
>> $ cmake --build build
>> $ cmake --install build
>> 
>>
>> Is my assumption right?
>
> Your understanding is correct. Thanks for checking.
>
>> BTW, why do you want to use "git submodule" to use the
>> object dispatch layer? Why don't you install it separately
>> or build by externalproject_add() in CMake?
>> https://cmake.org/cmake/help/latest/module/ExternalProject.html
>
> After reflecting on your response, we realize that using a git submodule 
> seems like a less than ideal solution.
>
> Initially, we were thinking that if code within the apache/arrow repository 
> were to "depend" on some MATLAB files from the object dispatch layer, that we 
> would need to "physically" (via vendoring or IP Clearance) or "virtually" 
> (via git submodule) integrate this code into the apache/arrow source tree. 
> However, since these files are only needed at build time / run time, this 
> means that the object dispatch layer code does not necessarily need to be 
> redistributable along with the rest of the code in the apache/arrow 
> repository.
>
> It seems much clearer now that the object dispatch layer can be treated as a 
> "pure" external "library" dependency, and thus, the code should not need to 
> be present in the apache/arrow repository. Per your suggestion, at build / 
> install time, it should be possible to copy any required MATLAB files or C++ 
> header files to appropriate locations, so that they can be used by CMake and 
> MATLAB.
>
> Using externalproject_add() is a great idea and seems more "in-model" than 
> repeatedly bumping the version of a vendored copy of the object dispatch 
> layer source or using a git submodule.
>
> To summarize, it sounds lik

Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ objects using MEX

2022-06-10 Thread Kevin Gurney
Hi Kou,

Thank you for helping to clear up our confusion.

> How do we install the object dispatch layer to use it in
> apache/arrow? I assumed that something like the following:
>
> 
> $ git clone https://github.com/mathworks/object-dispatch-layer.git
> $ cd object-dispatch-layer
> $ cmake -S . -B build ...
> $ cmake --build build
> $ cmake --install build
> $ git clone https://github.com/apache/arrow.git
> $ cd arrow/matlab
> $ cmake -S . -B build # This finds the installed object-dispatch-layer
> $ cmake --build build
> $ cmake --install build
> 
>
> Is my assumption right?

Your understanding is correct. Thanks for checking.

> BTW, why do you want to use "git submodule" to use the
> object dispatch layer? Why don't you install it separately
> or build by externalproject_add() in CMake?
> https://cmake.org/cmake/help/latest/module/ExternalProject.html

After reflecting on your response, we realize that using a git submodule seems 
like a less than ideal solution.

Initially, we were thinking that if code within the apache/arrow repository 
were to "depend" on some MATLAB files from the object dispatch layer, that we 
would need to "physically" (via vendoring or IP Clearance) or "virtually" (via 
git submodule) integrate this code into the apache/arrow source tree. However, 
since these files are only needed at build time / run time, this means that the 
object dispatch layer code does not necessarily need to be redistributable 
along with the rest of the code in the apache/arrow repository.

It seems much clearer now that the object dispatch layer can be treated as a 
"pure" external "library" dependency, and thus, the code should not need to be 
present in the apache/arrow repository. Per your suggestion, at build / install 
time, it should be possible to copy any required MATLAB files or C++ header 
files to appropriate locations, so that they can be used by CMake and MATLAB.

Using externalproject_add() is a great idea and seems more "in-model" than 
repeatedly bumping the version of a vendored copy of the object dispatch layer 
source or using a git submodule.

To summarize, it sounds like a reasonable path forward would be to:

1. Develop the object dispatch layer in an external repository underneath the 
MathWorks GitHub organization, with a 2-Clause BSD license.
2. Use externalproject_add() to fetch and build the source code dynamically.

Once the object dispatch layer is available on GitHub, I will follow up on this 
email thread with a link to the repository so that anyone in the community can 
track development progress, as well as contribute to the framework, if they are 
interested.

If anyone has any objections to this approach, please let us know.

Thank you!

Kevin Gurney


From: Sutou Kouhei 
Sent: Friday, June 10, 2022 4:13 AM
To: dev@arrow.apache.org 
Cc: Kevin Gurney ; Fiona La ; 
Jeremy Hughes ; Nick Haddad 
Subject: Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ 
objects using MEX

Hi,

> 1. A developer would author a custom MEX function that
> uses C++ "building blocks" (i.e. classes and header files)
> from the object dispatch layer "framework". They would
> link their custom MEX function against a helper shared
> library that is built from the source code of the object
> dispatch layer and provides the symbols/implementation for
> the aforementioned C++ "building blocks".

How do we install the object dispatch layer to use it in
apache/arrow? I assumed that something like the following:


$ git clone https://github.com/mathworks/object-dispatch-layer.git
$ cd object-dispatch-layer
$ cmake -S . -B build ...
$ cmake --build build
$ cmake --install build
$ git clone https://github.com/apache/arrow.git
$ cd arrow/matlab
$ cmake -S . -B build # This finds the installed object-dispatch-layer
$ cmake --build build
$ cmake --install build


Is my assumption right?

> Essentially, for a developer to use the object dispatch
> layer, they will need to author a fair amount of custom
> code which makes use of both MATLAB and C++ "building
> blocks" from the "framework".

I think that this is normal library usage. For example,
our S3 filesystem module implementation in C++ has about
2500 lines and uses classes provided by the AWS SDK for C++:
https://github.com/apache/arrow/blob/master/cpp/src/arrow/filesystem/s3fs.cc


> If we had to go through the IP Clearance Process, would
> that mean we would need to repeatedly clear the code every
> time we wanted to

Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ objects using MEX

2022-06-08 Thread Kevin Gurney
Hi Kou,

---

Note: I am replying to your email as a forward from Fiona (Cc'd), since your 
original email was accidentally blocked by my email client.

---

The way that we expected the object dispatch layer to be used by client code is 
as follows:

1. A developer would author a custom MEX function that uses C++ "building 
blocks" (i.e. classes and header files) from the object dispatch layer 
"framework". They would link their custom MEX function against a helper shared 
library that is built from the source code of the object dispatch layer and 
provides the symbols/implementation for the aforementioned C++ "building 
blocks".

2. The object dispatch layer expects the compiled MEX function to have a 
specific name and be available on the MATLAB Search Path [1] so that it can be 
used by the MATLAB side of the object dispatch layer.

3. Once the MEX function is available on the MATLAB Search Path, client MATLAB 
code can use a set of MATLAB "building blocks" (i.e. classes), which are part 
of the object dispatch layer "framework", to connect a MATLAB class with a 
corresponding C++ class.

Essentially, for a developer to use the object dispatch layer, they will need 
to author a fair amount of custom code which makes use of both MATLAB and C++ 
"building blocks" from the "framework".

It's not clear to me whether the steps described above classify as "library 
usage" with regard to the IP Clearance Process.

If we had to go through the IP Clearance Process, would that mean we would need 
to repeatedly clear the code every time we wanted to sync up the git submodule 
with the latest source code from the external repository? It seems like this 
would quickly become impractical since we anticipate the need to iterate 
frequently on the object dispatch layer early on.

It's quite possible that I am not answering your questions completely, so 
please let me know if anything is unclear. My apologies in advance for any 
confusion.

[1] 
https://www.mathworks.com/help/matlab/matlab_env/what-is-the-matlab-search-path.html

Best,

Kevin Gurney

From: Fiona La 
Sent: Wednesday, June 8, 2022 11:24 AM
To: Kevin Gurney 
Subject: FW: [MATLAB] Integrating a framework for connecting MATLAB and C++ 
objects using MEX






From: Sutou Kouhei 
Date: Tuesday, June 7, 2022 at 8:36 PM
To: dev@arrow.apache.org 
Cc: Fiona La , Jeremy Hughes , 
Nick Haddad 
Subject: Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ 
objects using MEX

Hi,

Can we use the object dispatch layer as a library? Or should
we copy (or submodule) the object dispatch layer to
apache/arrow?

If we can use the object dispatch layer as a library, we can
just use it as an external library like GoogleTest. We don't
need IP clearance. You can use any Apache License 2.0
compatible license for the object dispatch layer.

Thanks,
--
kou

In 

"[MATLAB] Integrating a framework for connecting MATLAB and C++ objects using 
MEX" on Tue, 7 Jun 2022 18:10:43 +,
Kevin Gurney  wrote:

> Hi All,
>
> I am reaching out to seek guidance from the community regarding a code 
> integration puzzle.
>
> The architecture that we are currently pursuing for the MATLAB interface to 
> Arrow [1] involves dispatching to the Arrow C++ libraries using MEX (a MATLAB 
> facility for calling C/C++ code [2]). A major challenge with this approach 
> has been keeping Arrow C++ objects (e.g. arrow::Array) alive in memory for 
> the appropriate amount of time and making it easy to interface with them from 
> MATLAB.
>
> MATLAB has a recommended solution for this problem [3]. However, we've been 
> pursuing a MEX-based solution due to the pervasiveness of MEX and its 
> familiarity to MATLAB users. Our hope is that using MEX will make it easy for 
> others to contribute to the MATLAB interface.
>
> To help maintain the connection between MATLAB objects and C++, we've been 
> experimenting with a MEX-based object dispatch layer. The primary goal of 
> this work is to unblock development of the MATLAB interface to Arrow. 
> However, this object dispatch layer is non-trivial and ultimately unrelated 
> to the Arrow project's core mission. Therefore, submitting this code to the 
> Arrow project doesn't seem like the optimal code integration strategy.
>
> We’ve been considering the possibility of creating a new open-source 
> repository under the MathWorks GitHub organization [4] to host the object 
> dispatch layer (a side effect of this approach is that it may help encourage 
> reuse of this infrastructure in future open-source MATLAB projects).
>
> However, this approach would come with notable tradeoffs:
>
> 1. We would need to follow the ASF IP Clearance Process [5] to integrate this 
> code into the Arrow project (it's possible we are mistaken abou

[MATLAB] Integrating a framework for connecting MATLAB and C++ objects using MEX

2022-06-07 Thread Kevin Gurney
Hi All,

I am reaching out to seek guidance from the community regarding a code 
integration puzzle.

The architecture that we are currently pursuing for the MATLAB interface to 
Arrow [1] involves dispatching to the Arrow C++ libraries using MEX (a MATLAB 
facility for calling C/C++ code [2]). A major challenge with this approach has 
been keeping Arrow C++ objects (e.g. arrow::Array) alive in memory for the 
appropriate amount of time and making it easy to interface with them from 
MATLAB.

MATLAB has a recommended solution for this problem [3]. However, we've been 
pursuing a MEX-based solution due to the pervasiveness of MEX and its 
familiarity to MATLAB users. Our hope is that using MEX will make it easy for 
others to contribute to the MATLAB interface.

To help maintain the connection between MATLAB objects and C++, we've been 
experimenting with a MEX-based object dispatch layer. The primary goal of this 
work is to unblock development of the MATLAB interface to Arrow. However, this 
object dispatch layer is non-trivial and ultimately unrelated to the Arrow 
project's core mission. Therefore, submitting this code to the Arrow project 
doesn't seem like the optimal code integration strategy.
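
For readers unfamiliar with the lifetime problem described above, here is a minimal sketch of one common pattern: a registry that keeps native-side objects alive and hands out integer handles to the client side. It is written in Python only for illustration and is not the actual object dispatch layer; the names ProxyRegistry, register, call, and release are hypothetical.

```python
import itertools


class ProxyRegistry:
    """Hypothetical registry that keeps objects alive until they are released.

    In a MEX setting, the client (MATLAB) side would hold only the integer
    handle, while the registry owns the underlying object (for example, a
    C++ object such as arrow::Array). This is an assumption for illustration.
    """

    def __init__(self):
        self._objects = {}                    # handle -> live object
        self._next_id = itertools.count(1)    # monotonically increasing handles

    def register(self, obj):
        """Store obj and return a new integer handle for it."""
        handle = next(self._next_id)
        self._objects[handle] = obj
        return handle

    def call(self, handle, method_name, *args):
        """Dispatch a method call to the object behind the handle."""
        obj = self._objects[handle]           # raises KeyError if released
        return getattr(obj, method_name)(*args)

    def release(self, handle):
        """Drop the registry's reference so the object can be destroyed."""
        self._objects.pop(handle, None)

    def __len__(self):
        return len(self._objects)
```

In this sketch the object stays alive exactly as long as its handle is registered, so the client-side class would call release from its destructor.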

We’ve been considering the possibility of creating a new open-source repository 
under the MathWorks GitHub organization [4] to host the object dispatch layer 
(a side effect of this approach is that it may help encourage reuse of this 
infrastructure in future open-source MATLAB projects).

However, this approach would come with notable tradeoffs:

1. We would need to follow the ASF IP Clearance Process [5] to integrate this 
code into the Arrow project (it's possible we are mistaken about this).

2. It's not obvious how we should keep the code in sync. Would it be possible 
to use a git submodule [6] to "symlink" to the external repo?

3. What about licensing? Does the code need to be Apache licensed, or would it 
be possible to use another Apache-compatible license [7], like BSD? BSD is the 
default choice for new projects hosted under the MathWorks GitHub organization.

Admittedly, we aren't sure what the best path forward is, so we appreciate the 
community's guidance. We welcome any suggestions.

[1] https://github.com/apache/arrow/tree/master/matlab
[2] https://www.mathworks.com/help/matlab/call-mex-files-1.html
[3] 
https://www.mathworks.com/help/matlab/build-matlab-interface-to-c-library.html
[4] https://github.com/mathworks
[5] https://incubator.apache.org/ip-clearance/
[6] https://github.blog/2016-02-01-working-with-submodules/
[7] https://www.apache.org/legal/resolved.html#category-a

Thank you,

Kevin Gurney


Re: Support for Co-authored-by tag on individual commits when integrating pull requests

2021-08-04 Thread Kevin Gurney
Hi Wes,

Thank you for the quick response!

No need to apologize! The Co-authored-by workflow is new to us, so we are 
learning what works as we go.

In terms of adding Fiona's name to the pull request that's already been 
integrated, we appreciate your consideration, but understand if this is too 
difficult to fix in the main branch at this point.

To prevent this issue from occurring in the future, we will open a pull request 
to modify the merge_arrow_pr.py script to scrape "Co-authored-by" tags as 
suggested.

Thank you!

Kevin


From: Wes McKinney 
Sent: Wednesday, August 4, 2021 11:02 AM
To: dev 
Cc: Fiona La 
Subject: Re: Support for Co-authored-by tag on individual commits when 
integrating pull requests

hi Kevin,

Unfortunately, I don't think it's possible to amend the existing
commit logs because that would require force-pushing the main branch.
I suppose we could revert the commit and push a new commit with the
commit message fixed.

> We realized after the pull request was integrated that Fiona may have gotten 
> credit if she pushed at least one commit from a separate GitHub account. 
> Although, we aren't 100% sure if this is true.

Indeed, if Fiona's e-mail address was in the git Author field for any
commit in the PR, the PR merge script would have added a
"Co-authored-by:" message to the squashed commit message.

I think the next step here is to modify the PR merge script to scrape
any "Co-authored-by:" lines from the individual commit messages so
they can all be listed in the combined PR message.
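
As a rough illustration of the scraping step described above, the following sketch collects "Co-authored-by:" trailers from a list of commit messages and de-duplicates them by e-mail address. It is not taken from merge_arrow_pr.py; the function names collect_co_authors and format_trailers are hypothetical, and how the commit messages are obtained is left out.

```python
import re

# Matches a trailer line such as "Co-authored-by: Jane Doe <jane@example.com>".
# The pattern is restricted to a single line so it cannot span paragraphs.
_CO_AUTHOR_RE = re.compile(
    r"^co-authored-by:[ \t]*(?P<name>[^<\n]+?)[ \t]*<(?P<email>[^>\n]+)>[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def collect_co_authors(commit_messages):
    """Return unique (name, email) pairs in first-seen order.

    De-duplication is by lower-cased e-mail address (an assumption; a real
    script might prefer a different identity rule).
    """
    seen = set()
    co_authors = []
    for message in commit_messages:
        for match in _CO_AUTHOR_RE.finditer(message):
            name = match.group("name").strip()
            email = match.group("email").strip()
            key = email.lower()
            if key not in seen:
                seen.add(key)
                co_authors.append((name, email))
    return co_authors


def format_trailers(co_authors):
    """Render the pairs as trailer lines for a squashed commit message."""
    return "\n".join(f"Co-authored-by: {name} <{email}>" for name, email in co_authors)
```

The resulting trailer lines could then be appended to the squashed commit message, separated from the body by a blank line.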

Sorry about this. To my knowledge, this is the first time this particular
issue has occurred.

Thanks
Wes

On Wed, Aug 4, 2021 at 9:46 AM Kevin Gurney  wrote:
>
> Hi All,
>
> Fiona La (Cc'd) and I recently worked together with Kou to integrate some 
> changes to the MATLAB interface (pull request: 
> https://github.com/apache/arrow/pull/10614).
>  Fiona and I pair programmed the implementation together on "one machine", 
> using my GitHub account to push commits. We used GitHub's support for 
> Co-authored-by tags 
> (https://docs.github.com/en/github/committing-changes-to-your-project/creating-and-editing-commits/creating-a-commit-with-multiple-authors)
>  to include Fiona's name on every commit. We thought this would be sufficient 
> to ensure that her name was included in the main Apache Arrow git history 
> after the commits were squashed and integrated by Kou. Unfortunately, it 
> looks like her name was dropped from the list of Co-authors during 
> integration.
>
> In order to ensure that all contributors to the project get credit:
>
> 1. Is there an existing, recommended best practice for pair programming on 
> pull requests that ensures all contributors get credit?
> * We realized after the pull request was integrated that Fiona may have 
> gotten credit if she pushed at least one commit from a separate GitHub 
> account. Although, we aren't 100% sure if this is true.
> 2. It looks like 
> https://github.com/apache/arrow/blob/master/dev/merge_arrow_pr.py
>  does not support the Co-authored-by tag workflow on individual commits 
> described above.
> * We are interested in opening a pull request to modify merge_arrow_pr.py to 
> add support for this workflow.
> 3. Is there a way to retroactively add Fiona's name to the git history for 
> https://github.com/apache/arrow/pull/10614
>  so she receives credit?
>
> Thank you!
>
> Kevin Gurney


Support for Co-authored-by tag on individual commits when integrating pull requests

2021-08-04 Thread Kevin Gurney
Hi All,

Fiona La (Cc'd) and I recently worked together with Kou to integrate some 
changes to the MATLAB interface (pull request: 
https://github.com/apache/arrow/pull/10614). Fiona and I pair programmed the 
implementation together on "one machine", using my GitHub account to push 
commits. We used GitHub's support for Co-authored-by tags 
(https://docs.github.com/en/github/committing-changes-to-your-project/creating-and-editing-commits/creating-a-commit-with-multiple-authors)
 to include Fiona's name on every commit. We thought this would be sufficient 
to ensure that her name was included in the main Apache Arrow git history after 
the commits were squashed and integrated by Kou. Unfortunately, it looks like 
her name was dropped from the list of Co-authors during integration.

In order to ensure that all contributors to the project get credit:

  1.  Is there an existing, recommended best practice for pair programming on 
pull requests that ensures all contributors get credit?
 *   We realized after the pull request was integrated that Fiona may have 
gotten credit if she pushed at least one commit from a separate GitHub account. 
Although, we aren't 100% sure if this is true.
  2.  It looks like 
https://github.com/apache/arrow/blob/master/dev/merge_arrow_pr.py does not 
support the Co-authored-by tag workflow on individual commits described above.
 *   We are interested in opening a pull request to modify 
merge_arrow_pr.py to add support for this workflow.
  3.  Is there a way to retroactively add Fiona's name to the git history for 
https://github.com/apache/arrow/pull/10614 so she receives credit?

Thank you!

Kevin Gurney


Re: [MATLAB] Developing a MATLAB Interface for Apache Arrow

2021-02-09 Thread Kevin Gurney
Hi All,

Just a friendly reminder that today is the soft feedback deadline mentioned in 
my previous email for providing feedback on the MATLAB interface design doc ( 
https://mathworks-my.sharepoint.com/:w:/p/kgurney/EcNXJh5S-HBCit-YNL6ZYnEB4Mv9ZPTVEs7a72SWlywIsg
 ). Please feel free to comment on the design doc with any questions or 
concerns you have.

As feedback settles, we'll start shifting our focus towards implementation. 
However, the design doc will continue to remain open to community input. We'll 
work on migrating it to Markdown so that it can live in GitHub for long term 
archival purposes and easier collaboration.

If you have any questions, don't hesitate to let me know.

Thank you!

Best Regards,

Kevin Gurney

From: Kevin Gurney 
Sent: Tuesday, February 2, 2021 5:05 PM
To: dev 
Cc: Antoine Pitrou ; Jeremy Hughes ; 
Nick Haddad ; Penny Anderson ; 
Fiona La ; Tahsin Hassan ; Yann 
Debray ; Wes McKinney 
Subject: Re: [MATLAB] Developing a MATLAB Interface for Apache Arrow

Hi All,

Thanks again for all of the feedback we have received so far on the design doc 
- it's been really helpful.

Fiona (Cc'd) and I took an initial pass at addressing the comments from Wes and 
Antoine. @Wes and @Antoine - whenever you get a chance, we would appreciate it 
if you could take a look over our responses to your feedback. Don't hesitate to 
let us know if you have any additional questions or concerns.

Some notable changes we have made based on the feedback received so far:

1. Added a section about the Arrow C Data Interface and In-Process Arrow 
Memory Sharing.
2. Modified the Out-of-Process Arrow Memory Sharing example to use the 
Arrow IPC File Format rather than Feather V2.
3. Added a note that Plasma is informally deprecated, and it may not make 
sense to invest effort in supporting it right now.

We continue to welcome anyone else in the community to add your thoughts and 
comments to the design doc.

To keep things moving along, we would like to set a "soft" deadline for 
feedback on this high level design doc for next **Tuesday, February 9th, 
2021**. After this "soft" deadline, everyone is still free to raise concerns as 
they come up, but we'll transition towards focusing on initial implementation 
(i.e. we'll assume it is reasonable to start prototyping and opening pull 
requests as appropriate).

Note: We realize that Word Online isn't turning out to be an ideal solution for 
collaborating on design docs. The lack of support for including names with 
comments is proving to be cumbersome (our apologies, again). In addition, for 
long term archival purposes, it would be best to move the document to a better 
location. Fiona and I believe that using Markdown to write future design docs 
and user guides would be preferable, as we can get all the benefits of normal 
GitHub version control, as well as have the docs live right alongside the 
MATLAB code. Commenting and general collaboration should be easier this way 
too. Our plan is to migrate this initial design doc to Markdown after feedback 
from the community has settled.

Thanks again to everyone for your help in getting this project off the ground!

Best Regards,

Kevin Gurney
From: Kevin Gurney 
Sent: Tuesday, January 26, 2021 11:47 AM
To: dev 
Cc: Antoine Pitrou ; Jeremy Hughes ; 
Nick Haddad ; Penny Anderson ; 
Fiona La ; Tahsin Hassan ; Yann 
Debray ; Wes McKinney 
Subject: Re: [MATLAB] Developing a MATLAB Interface for Apache Arrow

Hi Wes,

Thanks very much for taking the time to share your feedback!

Looking forward to incorporating more feedback from the community. Excited to 
work together to move this project forward!

Best Regards,

Kevin Gurney

From: Wes McKinney 
Sent: Monday, January 25, 2021 4:48 PM
To: dev 
Cc: Antoine Pitrou ; Jeremy Hughes ; 
Nick Haddad ; Penny Anderson ; 
Fiona La ; Tahsin Hassan ; Yann 
Debray 
Subject: Re: [MATLAB] Developing a MATLAB Interface for Apache Arrow

hi Kevin -- I read through the document. It seems plenty reasonable to
me. Look forward to seeing the buildout.

Thanks
Wes

On Mon, Jan 25, 2021 at 3:10 PM Kevin Gurney  wrote:
>
> Hi Antoine,
>
> Thanks very much for taking a first pass over the document! I'll start 
> working through the feedback you've provided soon.
>
> It's useful to have a variety of perspectives here, including from Arrow C++ 
> developers, like yourself. Also, I agree - having more MATLAB users provide 
> feedback on the document would be great to help ensure we are covering all 
> necessary requirements to make the interface as useful as possible.
>
> If anyone has any other ideas or suggestions, please don't hesitate to share 
> your feedback on the document.
>
> Best Regards,
>
> Kevin Gurney
> 
> From: Antoine Pitrou 
> Sent: Monday, January 25,

Re: [MATLAB] Developing a MATLAB Interface for Apache Arrow

2021-02-02 Thread Kevin Gurney
Hi All,

Thanks again for all of the feedback we have received so far on the design doc 
- it's been really helpful.

Fiona (Cc'd) and I took an initial pass at addressing the comments from Wes and 
Antoine. @Wes and @Antoine - whenever you get a chance, we would appreciate it 
if you could take a look over our responses to your feedback. Don't hesitate to 
let us know if you have any additional questions or concerns.

Some notable changes we have made based on the feedback received so far:

1. Added a section about the Arrow C Data Interface and In-Process Arrow 
Memory Sharing.
2. Modified the Out-of-Process Arrow Memory Sharing example to use the 
Arrow IPC File Format rather than Feather V2.
3. Added a note that Plasma is informally deprecated, and it may not make 
sense to invest effort in supporting it right now.

We continue to welcome anyone else in the community to add your thoughts and 
comments to the design doc.

To keep things moving along, we would like to set a "soft" deadline for 
feedback on this high level design doc for next **Tuesday, February 9th, 
2021**. After this "soft" deadline, everyone is still free to raise concerns as 
they come up, but we'll transition towards focusing on initial implementation 
(i.e. we'll assume it is reasonable to start prototyping and opening pull 
requests as appropriate).

Note: We realize that Word Online isn't turning out to be an ideal solution for 
collaborating on design docs. The lack of support for including names with 
comments is proving to be cumbersome (our apologies, again). In addition, for 
long term archival purposes, it would be best to move the document to a better 
location. Fiona and I believe that using Markdown to write future design docs 
and user guides would be preferable, as we can get all the benefits of normal 
GitHub version control, as well as have the docs live right alongside the 
MATLAB code. Commenting and general collaboration should be easier this way 
too. Our plan is to migrate this initial design doc to Markdown after feedback 
from the community has settled.

Thanks again to everyone for your help in getting this project off the ground!

Best Regards,

Kevin Gurney
________
From: Kevin Gurney 
Sent: Tuesday, January 26, 2021 11:47 AM
To: dev 
Cc: Antoine Pitrou ; Jeremy Hughes ; 
Nick Haddad ; Penny Anderson ; 
Fiona La ; Tahsin Hassan ; Yann 
Debray ; Wes McKinney 
Subject: Re: [MATLAB] Developing a MATLAB Interface for Apache Arrow

Hi Wes,

Thanks very much for taking the time to share your feedback!

Looking forward to incorporating more feedback from the community. Excited to 
work together to move this project forward!

Best Regards,

Kevin Gurney

From: Wes McKinney 
Sent: Monday, January 25, 2021 4:48 PM
To: dev 
Cc: Antoine Pitrou ; Jeremy Hughes ; 
Nick Haddad ; Penny Anderson ; 
Fiona La ; Tahsin Hassan ; Yann 
Debray 
Subject: Re: [MATLAB] Developing a MATLAB Interface for Apache Arrow

hi Kevin -- I read through the document. It seems plenty reasonable to
me. Look forward to seeing the buildout.

Thanks
Wes

On Mon, Jan 25, 2021 at 3:10 PM Kevin Gurney  wrote:
>
> Hi Antoine,
>
> Thanks very much for taking a first pass over the document! I'll start 
> working through the feedback you've provided soon.
>
> It's useful to have a variety of perspectives here, including from Arrow C++ 
> developers, like yourself. Also, I agree - having more MATLAB users provide 
> feedback on the document would be great to help ensure we are covering all 
> necessary requirements to make the interface as useful as possible.
>
> If anyone has any other ideas or suggestions, please don't hesitate to share 
> your feedback on the document.
>
> Best Regards,
>
> Kevin Gurney
> 
> From: Antoine Pitrou 
> Sent: Monday, January 25, 2021 12:53 PM
> To: dev@arrow.apache.org ; Kevin Gurney 
> 
> Cc: Jeremy Hughes ; Nick Haddad 
> ; Penny Anderson ; Fiona La 
> ; Tahsin Hassan ; Yann Debray 
> 
> Subject: Re: [MATLAB] Developing a MATLAB Interface for Apache Arrow
>
>
> Hi Kevin,
>
> I've added some comments to the document. Bear in mind that I'm not a
> MATLAB user, so this is some outside view from an Arrow C++ developer.
>
> It would be nice if MATLAB users could comment on the document,
> especially the proposed APIs.
>
> Regards
>
> Antoine.
>
>
> Le 22/01/2021 à 23:19, Kevin Gurney a écrit :
> > Hi Antoine,
> >
> > Thanks for your input!
> >
> > As you pointed out, I am in fact familiar with the matlab/ directory! :-) 
> > Several MathWorkers, including myself, helped contribute to this code a 
> > while back. We are hoping to use it as a starting point as we build out a 
> > more fully fledged MATLAB interface to Arro

Re: [MATLAB] Developing a MATLAB Interface for Apache Arrow

2021-01-26 Thread Kevin Gurney
Hi Wes,

Thanks very much for taking the time to share your feedback!

Looking forward to incorporating more feedback from the community. Excited to 
work together to move this project forward!

Best Regards,

Kevin Gurney

From: Wes McKinney 
Sent: Monday, January 25, 2021 4:48 PM
To: dev 
Cc: Antoine Pitrou ; Jeremy Hughes ; 
Nick Haddad ; Penny Anderson ; 
Fiona La ; Tahsin Hassan ; Yann 
Debray 
Subject: Re: [MATLAB] Developing a MATLAB Interface for Apache Arrow

hi Kevin -- I read through the document. It seems plenty reasonable to
me. Look forward to seeing the buildout.

Thanks
Wes

On Mon, Jan 25, 2021 at 3:10 PM Kevin Gurney  wrote:
>
> Hi Antoine,
>
> Thanks very much for taking a first pass over the document! I'll start 
> working through the feedback you've provided soon.
>
> It's useful to have a variety of perspectives here, including from Arrow C++ 
> developers, like yourself. Also, I agree - having more MATLAB users provide 
> feedback on the document would be great to help ensure we are covering all 
> necessary requirements to make the interface as useful as possible.
>
> If anyone has any other ideas or suggestions, please don't hesitate to share 
> your feedback on the document.
>
> Best Regards,
>
> Kevin Gurney
> 
> From: Antoine Pitrou 
> Sent: Monday, January 25, 2021 12:53 PM
> To: dev@arrow.apache.org ; Kevin Gurney 
> 
> Cc: Jeremy Hughes ; Nick Haddad 
> ; Penny Anderson ; Fiona La 
> ; Tahsin Hassan ; Yann Debray 
> 
> Subject: Re: [MATLAB] Developing a MATLAB Interface for Apache Arrow
>
>
> Hi Kevin,
>
> I've added some comments to the document. Bear in mind that I'm not a
> MATLAB user, so this is some outside view from an Arrow C++ developer.
>
> It would be nice if MATLAB users could comment on the document,
> especially the proposed APIs.
>
> Regards
>
> Antoine.
>
>
> Le 22/01/2021 à 23:19, Kevin Gurney a écrit :
> > Hi Antoine,
> >
> > Thanks for your input!
> >
> > As you pointed out, I am in fact familiar with the matlab/ directory! :-) 
> > Several MathWorkers, including myself, helped contribute to this code a 
> > while back. We are hoping to use it as a starting point as we build out a 
> > more fully fledged MATLAB interface to Arrow memory.
> >
> > Based on your suggestion, I've included a link to a Word Online version of 
> > the design document below:
> >
> > https://mathworks-my.sharepoint.com/:w:/p/kgurney/EcNXJh5S-HBCit-YNL6ZYnEB4Mv9ZPTVEs7a72SWlywIsg
> >
> > As far as I can tell, this link should allow commenting by anyone. 
> > Unfortunately, I'm not sure if the names of reviewers will be included when 
> > they comment. If this turns out to be the case, it would be great if 
> > reviewers could prefix their comments with something like [FirstName 
> > LastName] so we can track feedback appropriately.
> >
> > Don't hesitate to let me know if you have any issues accessing or 
> > commenting on the document. My apologies for the inconvenience in getting 
> > this properly shared.
> >
> > Best Regards,
> >
> > Kevin Gurney
> > 
> > From: Antoine Pitrou 
> > Sent: Friday, January 22, 2021 11:28 AM
> > To: dev@arrow.apache.org ; Kevin Gurney 
> > 
> > Cc: Jeremy Hughes ; Nick Haddad 
> > ; Penny Anderson ; Fiona La 
> > ; Tahsin Hassan ; Yann Debray 
> > 
> > Subject: Re: [MATLAB] Developing a MATLAB Interface for Apache Arrow
> >
> >
> > Hello Kevin,
> >
> > You could use Google Docs or similar to share the design document and
> > allow people to comment. Inside a Google Doc, you can use "File ->
> > Share" to create a sharable URL with specific permissions (such as
> > commenting but not editing).
> >
> > I was about to mention the matlab/ directory in the Arrow repository but
> > I see you're the main author, so you already know about it :-)
> >
> > Best regards
> >
> > Antoine.
> >
> >
> > Le 22/01/2021 à 16:05, Kevin Gurney a écrit :
> >> It seems like the mailing list stripped out the design doc I attached for 
> >> some reason.
> >>
> >> Here is a link to the same document hosted online instead:
> >>
> >> https://mathworks-my.sharep

Re: [MATLAB] Developing a MATLAB Interface for Apache Arrow

2021-01-25 Thread Kevin Gurney
Hi Antoine,

Thanks very much for taking a first pass over the document! I'll start working 
through the feedback you've provided soon.

It's useful to have a variety of perspectives here, including from Arrow C++ 
developers, like yourself. Also, I agree - having more MATLAB users provide 
feedback on the document would be great to help ensure we are covering all 
necessary requirements to make the interface as useful as possible.

If anyone has any other ideas or suggestions, please don't hesitate to share 
your feedback on the document.

Best Regards,

Kevin Gurney

From: Antoine Pitrou 
Sent: Monday, January 25, 2021 12:53 PM
To: dev@arrow.apache.org ; Kevin Gurney 

Cc: Jeremy Hughes ; Nick Haddad ; 
Penny Anderson ; Fiona La ; Tahsin 
Hassan ; Yann Debray 
Subject: Re: [MATLAB] Developing a MATLAB Interface for Apache Arrow


Hi Kevin,

I've added some comments to the document. Bear in mind that I'm not a
MATLAB user, so this is some outside view from an Arrow C++ developer.

It would be nice if MATLAB users could comment on the document,
especially the proposed APIs.

Regards

Antoine.


Le 22/01/2021 à 23:19, Kevin Gurney a écrit :
> Hi Antoine,
>
> Thanks for your input!
>
> As you pointed out, I am in fact familiar with the matlab/ directory! :-) 
> Several MathWorkers, including myself, helped contribute to this code a while 
> back. We are hoping to use it as a starting point as we build out a more 
> fully fledged MATLAB interface to Arrow memory.
>
> Based on your suggestion, I've included a link to a Word Online version of 
> the design document below:
>
> https://mathworks-my.sharepoint.com/:w:/p/kgurney/EcNXJh5S-HBCit-YNL6ZYnEB4Mv9ZPTVEs7a72SWlywIsg
>
> As far as I can tell, this link should allow commenting by anyone. 
> Unfortunately, I'm not sure if the names of reviewers will be included when 
> they comment. If this turns out to be the case, it would be great if 
> reviewers could prefix their comments with something like [FirstName 
> LastName] so we can track feedback appropriately.
>
> Don't hesitate to let me know if you have any issues accessing or commenting 
> on the document. My apologies for the inconvenience in getting this properly 
> shared.
>
> Best Regards,
>
> Kevin Gurney
> 
> From: Antoine Pitrou 
> Sent: Friday, January 22, 2021 11:28 AM
> To: dev@arrow.apache.org ; Kevin Gurney 
> 
> Cc: Jeremy Hughes ; Nick Haddad 
> ; Penny Anderson ; Fiona La 
> ; Tahsin Hassan ; Yann Debray 
> 
> Subject: Re: [MATLAB] Developing a MATLAB Interface for Apache Arrow
>
>
> Hello Kevin,
>
> You could use Google Docs or similar to share the design document and
> allow people to comment. Inside a Google Doc, you can use "File ->
> Share" to create a sharable URL with specific permissions (such as
> commenting but not editing).
>
> I was about to mention the matlab/ directory in the Arrow repository but
> I see you're the main author, so you already know about it :-)
>
> Best regards
>
> Antoine.
>
>
> Le 22/01/2021 à 16:05, Kevin Gurney a écrit :
>> It seems like the mailing list stripped out the design doc I attached for 
>> some reason.
>>
>> Here is a link to the same document hosted online instead:
>>
>> https://mathworks-my.sharepoint.com/:b:/p/kgurney/EU3Kdz0cubRJrkEyI1bNR88BKnH4S2siU2EHHNQwxTgHUg?e=wzLDx4
>>
>> Note: This link is only a temporary solution (will expire on February 21, 
>> 2021). It would be ideal if we could move this to a better place like the 
>> Arrow Confluence Design Documents area.
>>
>> Thanks,
>>
>> Kevin
>> 
>> From: Kevin Gurney 
>> Sent: Thursday, January 21, 2021 4:47 PM
>> To: dev@arrow.apache.org 
>> Cc: Jeremy Hughes ; Nick Haddad 
>> ; Penny Anderson ; Fiona La 
>> ; Tahsin Hassan ; Yann Debray 
>> 
>> Subject: [MATLAB] Developing a MATLAB Interface for Apache Arrow
>>
>> Hello All,
>>
>> MathWorks is interested in collaborating with the rest of the Arrow 
>> community to build out a MATLAB interface to Arrow memory. We envision an 
>> interface analogous to the other language bindings, with packaged classes 
>> and functions like:
>>
>> * arrow.A

Re: [MATLAB] Developing a MATLAB Interface for Apache Arrow

2021-01-22 Thread Kevin Gurney
Hi Antoine,

Thanks for your input!

As you pointed out, I am in fact familiar with the matlab/ directory! :-) 
Several MathWorkers, including myself, helped contribute to this code a while 
back. We are hoping to use it as a starting point as we build out a more fully 
fledged MATLAB interface to Arrow memory.

Based on your suggestion, I've included a link to a Word Online version of the 
design document below:

https://mathworks-my.sharepoint.com/:w:/p/kgurney/EcNXJh5S-HBCit-YNL6ZYnEB4Mv9ZPTVEs7a72SWlywIsg

As far as I can tell, this link should allow commenting by anyone. 
Unfortunately, I'm not sure if the names of reviewers will be included when 
they comment. If this turns out to be the case, it would be great if reviewers 
could prefix their comments with something like [FirstName LastName] so we can 
track feedback appropriately.

Don't hesitate to let me know if you have any issues accessing or commenting on 
the document. My apologies for the inconvenience in getting this properly 
shared.

Best Regards,

Kevin Gurney

From: Antoine Pitrou 
Sent: Friday, January 22, 2021 11:28 AM
To: dev@arrow.apache.org ; Kevin Gurney 

Cc: Jeremy Hughes ; Nick Haddad ; 
Penny Anderson ; Fiona La ; Tahsin 
Hassan ; Yann Debray 
Subject: Re: [MATLAB] Developing a MATLAB Interface for Apache Arrow


Hello Kevin,

You could use Google Docs or similar to share the design document and
allow people to comment. Inside a Google Doc, you can use "File ->
Share" to create a sharable URL with specific permissions (such as
commenting but not editing).

I was about to mention the matlab/ directory in the Arrow repository but
I see you're the main author, so you already know about it :-)

Best regards

Antoine.


Le 22/01/2021 à 16:05, Kevin Gurney a écrit :
> It seems like the mailing list stripped out the design doc I attached for 
> some reason.
>
> Here is a link to the same document hosted online instead:
>
> https://mathworks-my.sharepoint.com/:b:/p/kgurney/EU3Kdz0cubRJrkEyI1bNR88BKnH4S2siU2EHHNQwxTgHUg?e=wzLDx4
>
> Note: This link is only a temporary solution (will expire on February 21, 
> 2021). It would be ideal if we could move this to a better place like the 
> Arrow Confluence Design Documents area.
>
> Thanks,
>
> Kevin
> 
> From: Kevin Gurney 
> Sent: Thursday, January 21, 2021 4:47 PM
> To: dev@arrow.apache.org 
> Cc: Jeremy Hughes ; Nick Haddad 
> ; Penny Anderson ; Fiona La 
> ; Tahsin Hassan ; Yann Debray 
> 
> Subject: [MATLAB] Developing a MATLAB Interface for Apache Arrow
>
> Hello All,
>
> MathWorks is interested in collaborating with the rest of the Arrow community 
> to build out a MATLAB interface to Arrow memory. We envision an interface 
> analogous to the other language bindings, with packaged classes and functions 
> like:
>
> * arrow.Array
> * arrow.TableReader
> * arrow.type.Float64
> * ...
>
> In the past, several MathWorkers worked with the Arrow community to develop a 
> proof-of-concept MATLAB interface for reading/writing Feather V1 files by 
> leveraging the Arrow C++ libraries. Since then, the Arrow project has evolved 
> considerably, and we'd like to work with the community to expand MATLAB's 
> ability to interoperate with the broader Arrow ecosystem.
>
> Attached to this email is a lightweight design document which lays out a 
> high-level direction for these development efforts. We welcome any and all 
> feedback on this document.
>
> It would be great to move this design document to some place that is more 
> easily accessible and publicly archived for all members of the Arrow 
> community. At first glance, the Arrow Confluence Design Documents area 
> (https://cwiki.apache.org/confluence/display/ARROW/Design+Documents)
> seems like the ideal place. However, if you have other suggestions of how 
> best to collaborate on this document, please let me know.
>
> We are excited to work together with the rest of the Arrow community to make 
> this a reality.
>
> Best Regards,
>
> Kevin Gurney
>


Re: [MATLAB] Developing a MATLAB Interface for Apache Arrow

2021-01-22 Thread Kevin Gurney
It seems like the mailing list stripped out the design doc I attached for some 
reason.

Here is a link to the same document hosted online instead:

https://mathworks-my.sharepoint.com/:b:/p/kgurney/EU3Kdz0cubRJrkEyI1bNR88BKnH4S2siU2EHHNQwxTgHUg?e=wzLDx4

Note: This link is only a temporary solution (will expire on February 21, 
2021). It would be ideal if we could move this to a better place like the Arrow 
Confluence Design Documents area.

Thanks,

Kevin

From: Kevin Gurney 
Sent: Thursday, January 21, 2021 4:47 PM
To: dev@arrow.apache.org 
Cc: Jeremy Hughes ; Nick Haddad ; 
Penny Anderson ; Fiona La ; Tahsin 
Hassan ; Yann Debray 
Subject: [MATLAB] Developing a MATLAB Interface for Apache Arrow

Hello All,

MathWorks is interested in collaborating with the rest of the Arrow community 
to build out a MATLAB interface to Arrow memory. We envision an interface 
analogous to the other language bindings, with packaged classes and functions 
like:

  *   arrow.Array
  *   arrow.TableReader
  *   arrow.type.Float64
  *   ...

In the past, several MathWorkers worked with the Arrow community to develop a 
proof-of-concept MATLAB interface for reading/writing Feather V1 files by 
leveraging the Arrow C++ libraries. Since then, the Arrow project has evolved 
considerably, and we'd like to work with the community to expand MATLAB's 
ability to interoperate with the broader Arrow ecosystem.

Attached to this email is a lightweight design document which lays out a 
high-level direction for these development efforts. We welcome any and all 
feedback on this document.

It would be great to move this design document to some place that is more 
easily accessible and publicly archived for all members of the Arrow community. 
At first glance, the Arrow Confluence Design Documents area 
(https://cwiki.apache.org/confluence/display/ARROW/Design+Documents) seems like 
the ideal place. However, if you have other suggestions of how best to 
collaborate on this document, please let me know.

We are excited to work together with the rest of the Arrow community to make 
this a reality.

Best Regards,

Kevin Gurney


[MATLAB] Developing a MATLAB Interface for Apache Arrow

2021-01-21 Thread Kevin Gurney
Hello All,

MathWorks is interested in collaborating with the rest of the Arrow community 
to build out a MATLAB interface to Arrow memory. We envision an interface 
analogous to the other language bindings, with packaged classes and functions 
like:

  *   arrow.Array
  *   arrow.TableReader
  *   arrow.type.Float64
  *   ...

In the past, several MathWorkers worked with the Arrow community to develop a 
proof-of-concept MATLAB interface for reading/writing Feather V1 files by 
leveraging the Arrow C++ libraries. Since then, the Arrow project has evolved 
considerably, and we'd like to work with the community to expand MATLAB's 
ability to interoperate with the broader Arrow ecosystem.

Attached to this email is a lightweight design document which lays out a 
high-level direction for these development efforts. We welcome any and all 
feedback on this document.

It would be great to move this design document to some place that is more 
easily accessible and publicly archived for all members of the Arrow community. 
At first glance, the Arrow Confluence Design Documents area 
(https://cwiki.apache.org/confluence/display/ARROW/Design+Documents) seems like 
the ideal place. However, if you have other suggestions of how best to 
collaborate on this document, please let me know.

We are excited to work together with the rest of the Arrow community to make 
this a reality.

Best Regards,

Kevin Gurney


Re: Arrow sync call Wednesday 17:00 UTC / 12p US-Eastern

2019-02-05 Thread Kevin Gurney
Hi Wes,

Could you please add me to the recurring calendar invite?

Thank you!

Kevin Gurney


From: Wes McKinney 
Sent: Tuesday, February 5, 2019 12:44 PM
To: dev@arrow.apache.org
Subject: Arrow sync call Wednesday 17:00 UTC / 12p US-Eastern

As usual we will meet at https://meet.google.com/vtm-teks-phx

If anyone would like to be added to the recurring calendar invite,
please let me know and I'll add you



RE: Git workflow question

2019-01-31 Thread Kevin Gurney
Hi All,

In case it is helpful for others, I wanted to summarize the high-level workflow 
that we have been following at MathWorks to manage our mathworks/arrow fork of 
apache/arrow.

We use the mathworks/arrow fork as a sort of "staging" area, where we can 
experiment, perform preliminary code review, and prepare pull requests that 
will eventually be shared with the upstream apache/arrow project. Although we 
have multiple contributors to the mathworks/arrow fork, the steps we have been 
following should more or less work for personal forks too.

In general, mathworks/arrow follows the "Branch and Pull" workflow described 
here: http://www.goring.org/resources/project-management.html

Although we are still continuously refining this workflow based on experience 
and feedback, it seems to work pretty well for our purposes.

The basic process we follow is described below:

1. Clone the mathworks/arrow fork to your local machine.

$ git clone https://github.com/mathworks/arrow.git

2. Set up a remote to point to the upstream apache/arrow repository.

$ cd arrow/
$ git remote add apache https://github.com/apache/arrow.git

3. Sync the mathworks/arrow:master branch with the upstream apache/arrow:master 
branch.

$ git pull --ff-only apache master
$ git push

4. Create a new feature branch off the up-to-date mathworks/arrow:master 
branch. This branch will eventually be used for making a pull request against 
the upstream apache/arrow:master branch. All pull requests should be associated 
with an existing Apache JIRA Issue. We recommend naming your feature branch 
after the associated Apache JIRA Issue. For example, for an Apache JIRA issue 
named ARROW-1234, you could name your branch arrow_1234.

$ git checkout -b arrow_1234
$ git push --set-upstream origin arrow_1234

5. If mathworks/arrow:master moves ahead (because it is re-synced with the 
upstream apache/arrow:master branch), then rebase the feature branch onto the 
mathworks/arrow:master branch. This will ensure that any commits made on the 
feature branch will be developed "on top" of the latest mathworks/arrow:master 
commit history.

$ git rebase master

NOTE: We use a "work in progress" branch as detailed in steps 6 and 8 for 
managing preliminary code reviews in mathworks/arrow. Therefore, these steps 
may not be necessary if you are managing a personal fork.

6. Create a "work in progress" branch named after the feature branch with a 
_wip suffix (where wip = "work in progress"), for example arrow_1234_wip. This 
branch will be used as a "staging" area for doing iterative feature development.

$ git checkout -b arrow_1234_wip
$ git push --set-upstream origin arrow_1234_wip

7. Do work on the _wip branch. Commit as little or as often as you like. Push 
the changes to the _wip branch on mathworks/arrow as often as you feel is 
appropriate.

$ git commit -am "Fix issue with ..."
$ git push

8. When you are ready, create a preliminary code review by making a pull 
request from the _wip branch to the corresponding feature branch. Once the 
code review is complete, accept the pull request into the feature branch.

9. Once the feature branch is finalized, make a pull request from the 
mathworks/arrow feature branch against the upstream apache/arrow:master 
branch. Collaborate with the rest of the Arrow community to address feedback on 
your changes.

10. If you see any CI failures, inspect the Travis CI and/or AppVeyor logs to 
determine whether the failures are occurring due to your changes or some 
unrelated recent commits to apache/arrow:master. If the failures are occurring 
due to your changes, you can make any necessary changes on your local clone of 
the mathworks/arrow feature branch. Pushing this branch to mathworks/arrow 
will re-run CI jobs.

11. If CI failures continue to occur which appear unrelated to your pull 
request, add a comment to your pull request which mentions this and wait for 
the CI build of apache/arrow:master to start passing again (the status of the 
CI build is displayed as a badge on the Arrow README page). Once the master CI 
build of Arrow is passing again, rebase your feature branch onto the upstream 
apache/arrow:master branch (this will also automatically re-run CI):

$ git checkout master
$ git pull --ff-only apache master
$ git push
$ git checkout arrow_1234
$ git rebase master
$ git push --force

NOTE: Wes has pointed out that GitHub will not notify reviewers if you force 
push commits (with git push --force). You should add a comment to your pull 
request whenever force pushing to inform reviewers of any new changes.
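
For anyone who would like to try the sync-and-rebase steps above without 
touching a real fork, here is a minimal sketch that replays them against 
throwaway local repositories. The bare repositories apache.git and fork.git 
are only stand-ins for apache/arrow and mathworks/arrow, and the file names, 
commit messages, and committer identity are made up for illustration. The _wip 
review branch from steps 6 to 8 is left out to keep the sketch short.

```shell
#!/bin/sh
# Sketch only: local bare repositories stand in for the real GitHub remotes.
set -eu

work=$(mktemp -d)
cd "$work"

# Hypothetical stand-ins for apache/arrow (upstream) and mathworks/arrow (fork).
git init -q --bare apache.git
git init -q --bare fork.git
git --git-dir=apache.git symbolic-ref HEAD refs/heads/master
git --git-dir=fork.git symbolic-ref HEAD refs/heads/master

# Seed upstream with one commit and mirror it into the fork.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
git config user.email "dev@example.com"
git config user.name "Example Dev"
echo "v1" > README
git add README
git commit -q -m "Initial commit"
git push -q "$work/apache.git" master
git push -q "$work/fork.git" master
cd "$work"

# Steps 1-2: clone the fork and add a remote pointing at upstream.
git clone -q "$work/fork.git" "$work/clone"
cd clone
git config user.email "dev@example.com"
git config user.name "Example Dev"
git remote add apache "$work/apache.git"

# Step 4: create and publish a feature branch named after the JIRA issue.
git checkout -q -b arrow_1234
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "ARROW-1234: add feature"
git push -q --set-upstream origin arrow_1234

# Meanwhile, upstream master moves ahead by one commit.
cd "$work/seed"
echo "v2" >> README
git commit -q -am "Upstream change"
git push -q "$work/apache.git" master
cd "$work/clone"

# Steps 3 and 11: fast-forward master from upstream, update the fork,
# then rebase the feature branch and force push the rewritten history.
git checkout -q master
git pull -q --ff-only apache master
git push -q
git checkout -q arrow_1234
git rebase -q master
git push -q --force
```

After the script finishes, the feature branch in the clone sits one commit on 
top of the refreshed master, and the fork holds the same rebased history.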

---

We would welcome any feedback on this workflow, and if others think it would be 
useful, I would be happy to contribute a generic version of these process 
details to the Apache Arrow contribution guidelines.

Best Regards,

Kevin Gurney

-Original Message-
From: Ravindra Pindikura  
Sent: Wednesday, January 30, 2019 8:52 PM
To: dev@arrow.apache.org
Subject: Re: Git workflow question

Ok. Thanks, wes.

> On Jan 30, 2019,

[jira] [Created] (ARROW-3896) Decouple MATLAB-Arrow conversion logic from Feather file specific logic

2018-11-28 Thread Kevin Gurney (JIRA)
Kevin Gurney created ARROW-3896:
---

 Summary: Decouple MATLAB-Arrow conversion logic from Feather file 
specific logic
 Key: ARROW-3896
 URL: https://issues.apache.org/jira/browse/ARROW-3896
 Project: Apache Arrow
  Issue Type: Improvement
  Components: MATLAB
Reporter: Kevin Gurney
 Fix For: 0.13.0


Currently, the logic for converting between a MATLAB mxArray and various Arrow 
data structures (arrow::Table, arrow::Array, etc.) is tightly coupled and 
fairly tangled up with the logic specific to handling Feather files. It would 
be helpful to factor out these conversions into a more generic "mlarrow" 
conversion layer component so that it can be reused in the future for use cases 
other than Feather support. Furthermore, this would be helpful to enforce a 
cleaner separation of concerns.

It would be nice to start off with this refactoring work up front before adding 
support for more datatypes to the MATLAB featherread/featherwrite functions, so 
that we can start off with a clean base upon which to expand moving forward.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Making a bugfix 0.11.1 release

2018-10-18 Thread Kevin Gurney
Hi Antoine,


Thanks for the quick response!


This helps to clear up my confusion.


Best Regards,


Kevin Gurney


From: Antoine Pitrou 
Sent: Thursday, October 18, 2018 9:54:47 AM
To: dev@arrow.apache.org
Subject: Re: Making a bugfix 0.11.1 release


Le 18/10/2018 à 15:44, Kevin Gurney a écrit :
> Hi All,
>
> We are working with the arrow version 0.9.0 C++ libraries in conjunction with 
> separate parquet-cpp version 1.4.0.
>
> Questions:
>
>   1.  Does this zlib issue affect all clients of the arrow C++ libraries or 
> just the Python PyArrow code?

To be clear: this is a packaging issue and only affects the PyArrow
binary wheels (i.e. if you type "pip install pyarrow").  It should not
affect people who self-compile Arrow or PyArrow; and it probably doesn't
affect people who download other binaries, either (such as Conda packages).

Regards

Antoine.



Re: Making a bugfix 0.11.1 release

2018-10-18 Thread Kevin Gurney
Hi All,

We are working with the arrow version 0.9.0 C++ libraries in conjunction with 
separate parquet-cpp version 1.4.0.

Questions:

  1.  Does this zlib issue affect all clients of the arrow C++ libraries or 
just the Python PyArrow code?
  2.  Does this zlib compression issue also affect the arrow version 0.9.0 C++ 
libraries (before parquet-cpp was merged in), or only the latest arrow version 
0.11.0 C++ libraries (with parquet-cpp merged in)?

Best Regards,

Kevin Gurney


From: Krisztián Szűcs 
Sent: Thursday, October 18, 2018 5:31:01 AM
To: dev@arrow.apache.org
Subject: Re: Making a bugfix 0.11.1 release

I've added the two zlib issues to 0.11.1 version:
https://issues.apache.org/jira/projects/ARROW/versions/12344316

On Wed, Oct 17, 2018 at 10:51 PM Wes McKinney  wrote:

> Got it, thank you for clarifying. It wasn't clear whether the bug
> would occur in the build environment (CentOS 5 + devtoolset-2) as well
> as other Linux environments.
> On Wed, Oct 17, 2018 at 4:16 PM Antoine Pitrou  wrote:
> >
> >
> > Le 17/10/2018 à 20:38, Wes McKinney a écrit :
> > > hi folks,
> > >
> > > Since the Python wheels are being installed 10,000 times per day or
> > > more, I don't think we should allow them to be broken for much longer.
> > >
> > > What additional patches need to be done before an RC can be cut? Since
> > > I'm concerned about the broken patches undermining the project's
> > > reputation, I can adjust my priorities to start a release vote later
> > > today or first thing tomorrow morning. Seems like
> > > https://issues.apache.org/jira/browse/ARROW-3535 might be the last
> > > item, and I can prepare a maintenance branch with the cherry-picked
> > > fixes
> > >
> > > Was there a determination as to why our CI systems did not catch the
> > > blocker ARROW-3514?
> >
> > Because it was not exercised by the test suite.  My take is that the bug
> > would only happen with specific data, e.g. tiny and/or entirely
> > incompressible.  I don't think general gzip compression of Parquet files
> > was broken.
> >
> > Regards
> >
> > Antoine.
>


[jira] [Created] (ARROW-2750) Add MATLAB support for reading numeric types from Feather files

2018-06-26 Thread Kevin Gurney (JIRA)
Kevin Gurney created ARROW-2750:
---

 Summary: Add MATLAB support for reading numeric types from Feather 
files
 Key: ARROW-2750
 URL: https://issues.apache.org/jira/browse/ARROW-2750
 Project: Apache Arrow
  Issue Type: New Feature
Affects Versions: 0.10.0
 Environment: Tested on Debian 9, Windows 10, and macOS 10.13.
Reporter: Kevin Gurney
 Fix For: 0.10.0


Add MATLAB support for reading numeric types (i.e. [u]int[x][y], float, double) 
from Feather files. This is the first in a series of future feature submissions 
for Feather read/write support and other Arrow IPC integration with MATLAB.

The associated pull request creates a top-level "matlab" directory in the 
Apache Arrow project. It also introduces a MATLAB function "featherread", which 
takes a Feather filename as input and returns a MATLAB table. featherread maps 
Feather datatypes to corresponding MATLAB datatypes.

This initial pull request does not support null values.

featherread.m calls the Arrow C++ APIs using MEX, MATLAB's facility for calling 
C/C++ code.

See the README.md in the "matlab" directory for instructions on how to build 
the MEX interface. Currently, building on Windows using CMake is not fully 
functional, but the MEX interface can be compiled manually using the MATLAB 
"mex" command.

A MATLAB install is required to be present on your machine to build the MEX 
interface. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)