Hi,

> One open question about this approach: which GitHub
> repository should we use for hosting the MLTBX via GitHub
> Releases?
> 
> We don't think using the main apache/arrow GitHub Releases
> area is the right approach. So, would it make sense to
> create a separate "bridge" repository just for hosting the
> latest MLTBX files? Should this be an ASF associated
> repository like apache/arrow-matlab or would a MathWorks
> associated repository like mathworks/arrow-matlab be OK?
> We aren't sure what makes the most sense here, but welcome
> any suggestions.

We can use apache/arrow's GitHub Releases. The release
distribution document says that we can use GitHub as a
release platform:
https://infra.apache.org/release-distribution.html#other-platforms

apache/arrow doesn't use GitHub Releases yet but
apache/arrow-adbc and apache/arrow-flight-sql-postgresql
already use GitHub Releases. (We just use "gh release
upload" to upload our artifacts to GitHub Releases.)

BTW, how does File Exchange's "Connecting to GitHub Repositories" feature work?
https://www.mathworks.com/matlabcentral/content/fx/about.html#Why_GitHub

Does it just poll GitHub? Or do we need to install a GitHub
App, set a secret variable, or something similar on
apache/arrow? If the latter, we need to ask INFRA to do it.

If we use GitHub Releases on apache/arrow, we can use the
following workflow. We don't need to use JFrog.

1. RC: Upload MLTBX to GitHub Releases for apache-arrow-X.Y.Z-rcN
2. Release: Run a post release script that would:
2.1 Download MLTBX from GitHub Releases for apache-arrow-X.Y.Z-rcN
2.2 Upload it to GitHub Releases for apache-arrow-X.Y.Z
2.3 Linked File Exchange entry will be automatically updated
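
For illustration only, steps 2.1 and 2.2 above could be scripted
roughly as follows. This is a minimal Python sketch, not the
project's actual post release script: the function names are
hypothetical, the "*.mltbx" asset pattern is an assumption about
how the file is named, and it assumes that the "gh" CLI is
installed and authenticated and that the GitHub Release for
apache-arrow-X.Y.Z already exists.

```python
import glob
import os
import shlex
import subprocess
import tempfile

REPO = "apache/arrow"


def download_command(version, rc, workdir, repo=REPO):
    """Step 2.1: fetch the MLTBX assets attached to the RC release."""
    rc_tag = f"apache-arrow-{version}-rc{rc}"
    return ["gh", "release", "download", rc_tag,
            "--repo", repo, "--pattern", "*.mltbx", "--dir", workdir]


def upload_command(version, files, repo=REPO):
    """Step 2.2: attach the same files to the final release."""
    release_tag = f"apache-arrow-{version}"
    return ["gh", "release", "upload", release_tag, *files, "--repo", repo]


def promote(version, rc, dry_run=True):
    """Copy MLTBX files from the RC release to the final release."""
    with tempfile.TemporaryDirectory() as workdir:
        download = download_command(version, rc, workdir)
        if dry_run:
            # Only show what would run; nothing is downloaded or uploaded.
            print(shlex.join(download))
            print(shlex.join(upload_command(version, ["<downloaded>.mltbx"])))
            return
        subprocess.run(download, check=True)
        files = sorted(glob.glob(os.path.join(workdir, "*.mltbx")))
        if not files:
            raise RuntimeError("no MLTBX asset found on the RC release")
        subprocess.run(upload_command(version, files), check=True)
```

Calling promote("15.0.0", 0) only prints the two commands;
passing dry_run=False would actually run them. Step 2.3 is not
scripted here because the File Exchange update is expected to
happen on the File Exchange side.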


> File Exchange entries have a "Version History" which
> includes release notes from the "backing" GitHub Releases
> area. So, this would probably be a sensible location to
> put the release notes.

We can use GitHub Releases, as I said. But the release notes
on apache/arrow's GitHub Releases may cover all
implementations, not only the MATLAB interface, so they may
not be useful for this use case.
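
If we still wanted MATLAB-only notes there, one option might be
to filter the changelog by component tag. A minimal Python
sketch, assuming each entry is a one-line title carrying tags
such as "[MATLAB]" (the style used in apache/arrow PR titles).
The function name and the sample entries, including their issue
numbers, are made up:

```python
import re

# Matches every "[Component]" tag in a title, e.g. "[C++][Python]".
TAG_RE = re.compile(r"\[([^\]]+)\]")


def entries_for(lines, component="MATLAB"):
    """Keep only the changelog lines tagged with the given component."""
    return [line.strip() for line in lines
            if component in TAG_RE.findall(line)]


# Made-up sample entries; the issue numbers are placeholders.
sample = [
    "GH-00001: [MATLAB] Add a hypothetical feature",
    "GH-00002: [C++][Python] Fix a hypothetical bug",
    "GH-00003: [CI][MATLAB] Package a hypothetical MLTBX",
]
print("\n".join(entries_for(sample)))
```

This keeps the first and third sample entries and drops the
second one.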

FYI: The R bindings have their release notes under
https://arrow.apache.org/docs/r/ . See
https://arrow.apache.org/docs/r/news/ .

> Also, including MATLAB updates in
> Apache Arrow release blog posts
> (e.g. https://arrow.apache.org/blog/2023/11/01/14.0.0-release/)
> may also be helpful.

Yes. We should do it. :-)


Thanks,
-- 
kou

In 
<mn2pr05mb6496df713e917c66e30ab50cae...@mn2pr05mb6496.namprd05.prod.outlook.com>
  "Re: [DISCUSS][MATLAB] Proposal for incremental point releases of the MATLAB 
interface" on Wed, 8 Nov 2023 20:44:10 +0000,
  Kevin Gurney <kgur...@mathworks.com.INVALID> wrote:

> Hi Kou and Dewey,
> 
> Thank you very much for your very thorough and detailed responses to all of 
> our questions. This is extremely valuable feedback and the points that you 
> made make a lot of sense.
> 
> Sarah and I talked this over a bit more and we think that sticking with the 
> overall apache/arrow project release cycle (i.e. stay in line with 15.0.0) 
> makes the most sense in the long term.
> 
> @Dewey - thanks very much for highlighting the pros and cons of creating a 
> separate repository. We also really appreciate the community being willing to 
> try and support our development needs. That being said, we think it is 
> probably best to stay in-model with the main apache/arrow release process for 
> the time being rather than creating a separate repository for the MATLAB 
> interface.
> 
> To address some related points and questions:
> 
>> Can we just mention "This is not stable yet!!!" in the documentation instead 
>> of using an isolated version?
> 
> Yes. This is a good point and we already have a disclaimer in the README.md [1] 
> for the MATLAB interface which says: "Warning The MATLAB interface is under 
> active development and should be considered experimental."
> 
>> It's better that we use CI for this like other binary packages such as 
>> .deb/.rpm/.wheel/.jar/...
> 
> This makes sense and we agree. We will follow up with PRs to add the 
> necessary MATLAB packaging scripts and CI workflow files.
> 
>> Does the MLTBX file include Apache Arrow C++ binaries too like .wheel/.jar?
> 
> Yes. The MLTBX file will package the Apache Arrow C++ binaries, similar to 
> the Java JARs / Python wheels.
> 
>> MATLAB doesn't provide an official package repository such as PyPI for 
>> Python and https://rubygems.org/ for Ruby, right?
> 
> The equivalent to pypi.org or rubygems.org for MATLAB would be the MathWorks 
> File Exchange [2].
> 
>> If the official package repository for MATLAB doesn't exist, JFrog is better 
>> because the MLTBX file will be large (Apache Arrow C++ binaries are large).
> 
> As noted above, the "official package repository" for MATLAB would be the 
> MathWorks File Exchange. File Exchange has tight integration with GitHub [3]. 
> When a new release is available in GitHub Releases, the associated File 
> Exchange entry will be automatically updated.
> 
> We believe we could leverage this integration between File Exchange and 
> GitHub Releases to automate the MATLAB interface release process. This 
> approach might look like:
> 
> 1. Upload MLTBX to JFrog Artifactory
> 2. Run a post release script that would:
> 2.1 Download MLTBX from JFrog Artifactory
> 2.2 Upload to GitHub Releases (e.g. apache/arrow-matlab - see discussion 
> below)
> 2.3 Linked File Exchange entry will be automatically updated
> 
> One open question about this approach: which GitHub repository should we use 
> for hosting the MLTBX via GitHub Releases?
> 
> We don't think using the main apache/arrow GitHub Releases area is the right 
> approach. So, would it make sense to create a separate "bridge" repository 
> just for hosting the latest MLTBX files? Should this be an ASF associated 
> repository like apache/arrow-matlab or would a MathWorks associated 
> repository like mathworks/arrow-matlab be OK? We aren't sure what makes the 
> most sense here, but welcome any suggestions.
> 
>> We may want to use the status page for it: 
>> https://arrow.apache.org/docs/status.html
> 
> Thanks for highlighting this. This makes sense, and we can follow up with a 
> PR to add MATLAB to the status page.
> 
>> How about creating https://arrow.apache.org/docs/matlab/ ? We can use Sphinx 
>> like the Python docs https://arrow.apache.org/docs/python/ or another 
>> documentation tool like the R docs https://arrow.apache.org/docs/r/ . If we 
>> use Sphinx, we can create 
>> https://github.com/apache/arrow/tree/main/docs/source/matlab/
> 
> This makes sense and eventually we want to have comprehensive documentation 
> in line with other language bindings using Sphinx. In addition to 
> comprehensive documentation, we were also hoping that we could host release 
> notes in a place that is easily accessible from the MLTBX download location. 
> File Exchange entries have a "Version History" which includes release notes 
> from the "backing" GitHub Releases area. So, this would probably be a 
> sensible location to put the release notes. Also, including MATLAB updates in 
> Apache Arrow release blog posts (e.g. 
> https://arrow.apache.org/blog/2023/11/01/14.0.0-release/) may also be helpful.
> 
> --
> 
> We really appreciate all of the community's guidance on navigating the 
> release process!
> 
> We will get started on integrating with the existing release tooling.
> 
> [1] https://github.com/apache/arrow/tree/main/matlab#status
> [2] https://www.mathworks.com/matlabcentral/fileexchange
> [3] https://www.mathworks.com/matlabcentral/content/fx/about.html#Why_GitHub
> 
> Best Regards,
> 
> Kevin Gurney
> ________________________________
> From: Dewey Dunnington <de...@voltrondata.com.INVALID>
> Sent: Tuesday, November 7, 2023 8:53 PM
> To: dev@arrow.apache.org <dev@arrow.apache.org>
> Cc: Sarah Gilmore <sgilm...@mathworks.com>; Lei Hou <lei...@mathworks.com>
> Subject: Re: [DISCUSS][MATLAB] Proposal for incremental point releases of the 
> MATLAB interface
> 
> For argument's sake, I might suggest that the process you described in
> your initial note would probably work best in another repo: you would
> be able to iterate faster and release/version at your own pace. The
> flexibility you get from moving to a separate repo comes at the cost
> of extra responsibility: you have to set up your own CI, manage your
> own issues, and set up your own release verification scripts + release
> votes on the mailing list. Because you bind Arrow C++, you would have
> to take sufficient steps to ensure that the Arrow C++ developers are
> made aware of changes that break the MATLAB bindings and vice versa
> (i.e., test against dev Arrow C++ in a CI job).
> 
> Setting up that infrastructure for apache/arrow-nanoarrow took ~a week
> of development time, and it now takes ~half a day to release a new
> version (it took more for the first few versions, and the MATLAB
> version has considerably higher complexity). Probably the biggest
> barrier to releasing from another repo is that you have to ensure a
> critical mass of PMC members can/will run your release verification
> script and vote.
> 
> I happen to feel that it's the PMC's/wider community's responsibility
> to help language binding contributors adopt a workflow that suits
> their needs. If active MATLAB contributors agree that they want to
> release version 0.1 from another repo, (I feel that) we're here to
> help you do that. If the active contributors want to stay in
> apache/arrow, there is less flexibility about what you release and
> when; however, the release process is well-defined.
> 
> On Tue, Nov 7, 2023 at 8:43 PM Sutou Kouhei <k...@clear-code.com> wrote:
>>
>> Hi,
>>
>> > As a point of reference, we noticed that PyArrow is on
>> > version 14.0.0, but it feels "misleading" to say that the
>> > MATLAB interface is at version 14.0.0 when we haven't yet
>> > implemented or stabilized all core Arrow APIs.
>>
>> I can understand this but I suggest that we use the same
>> version as other packages in apache/arrow. Because:
>>
>> * Using an isolated version increases release complexity.
>> * Using an isolated version may introduce another kind of
>>   confusion. For example, "the MATLAB interface 1.0.0 uses
>>   Apache Arrow C++ 20.0.0" may be misleading:
>>   * The MATLAB interface 1.0.0 doesn't use Apache Arrow C++
>>     1.0.0.
>>   * It may be difficult to find the corresponding
>>     Apache Arrow C++ version from the MATLAB interface
>>     version.
>>
>> Can we just mention "This is not stable yet!!!" in the
>> documentation instead of using an isolated version?
>>
>> We may want to use the status page for it:
>> https://arrow.apache.org/docs/status.html
>>
>> > 1. Manually build the MATLAB interface on Windows, macOS, and Linux
>>
>> It's better that we use CI for this like other binary
>> packages such as .deb/.rpm/.wheel/.jar/...
>>
>> If we release the MATLAB interface separately, which Apache
>> Arrow C++ version is used? If we release the MATLAB
>> interface right now, is Apache Arrow C++ 14.0.0 (the latest
>> release) used or is Apache Arrow C++ main (not released yet)
>> used? The MATLAB interface on main will depend on Apache
>> Arrow C++ main, so we may not be able to use the latest
>> release for the MATLAB interface on main.
>>
>> > 2. Combine all of the cross platform build artifacts into
>> > a single MLTBX file [1] for distribution
>>
>> Does the MLTBX file include Apache Arrow C++ binaries too
>> like .wheel/.jar?
>>
>> > 3. Host the MLTBX somewhere that is easily accessible for download
>>
>> MATLAB doesn't provide an official package repository such
>> as PyPI for Python and https://rubygems.org/ for Ruby,
>> right?
>>
>> > 1. Is there a recommended location where we can host the MLTBX file? e.g. 
>> > GitHub Releases [2], JFrog [3], etc.?
>>
>> If the official package repository for MATLAB doesn't exist,
>> JFrog is better because the MLTBX file will be large (Apache
>> Arrow C++ binaries are large).
>>
>> > 2. Is there a recommended location for hosting release notes?
>>
>> How about creating https://arrow.apache.org/docs/matlab/ ?
>> We can use Sphinx like the Python docs
>> https://arrow.apache.org/docs/python/ or another
>> documentation tool like the R docs
>> https://arrow.apache.org/docs/r/ .
>> If we use Sphinx, we can create
>> https://github.com/apache/arrow/tree/main/docs/source/matlab/ .
>>
>> > 3. Is there a recommended cadence for incremental point releases?
>>
>> As mentioned above, I suggest avoiding a separate release.
>>
>> > 4. Are there any notable ASF procedures [4] [5] (e.g. voting on a new 
>> > release proposal) that we should be aware of as we consider creating an 
>> > initial release?
>>
>> We don't need any additional tasks for an initial release.
>>
>> > 5. How should the Arrow project release (i.e. 14.0.0)
>> > relate to the MATLAB interface version (i.e. 0.1)? As a
>> > point of reference, we noticed that PyArrow is on
>> > version 14.0.0, but it feels "misleading" to say that
>> > the MATLAB interface is at version 14.0.0 when we
>> > haven't yet implemented or stabilized all core Arrow
>> > APIs. Is there any precedent for using independent
>> > release versions for language bindings which are not
>> > fully stabilized and are also part of the main
>> > apache/arrow repository?
>>
>> We don't have any precedent for using independent release
>> versions for language bindings. All language bindings have
>> used the same version.
>>
>> Apache Arrow JavaScript isn't a language binding but it
>> used separate releases and isolated versions until
>> 0.4.1. It joined the apache/arrow release after 0.4.1. (The
>> next version after Apache Arrow JavaScript 0.4.1 is 0.13.0.)
>>
>> > We've noticed that Arrow-related projects which are not
>> > part of the main apache/arrow GitHub repository
>> > (e.g. DataFusion) follow a mailing list-based voting and
>> > release process. However, it's not clear whether it makes
>> > sense to follow this process for the MATLAB interface
>> > since it is part of the main apache/arrow repository.
>>
>> If we want to use a separate release for the MATLAB
>> interface, we should follow the same release process as
>> apache/arrow and other apache/arrow-* repositories because
>> it's the standard ASF release process.
>>
>>
>> Thanks,
>> --
>> kou
>>
>> In 
>> <mn2pr05mb649619998eae9579cceba692ae...@mn2pr05mb6496.namprd05.prod.outlook.com>
>> "[DISCUSS][MATLAB] Proposal for incremental point releases of the MATLAB 
>> interface" on Tue, 7 Nov 2023 20:31:31 +0000,
>> Kevin Gurney <kgur...@mathworks.com.INVALID> wrote:
>>
>> > Hi All,
>> >
>> > A considerable amount of new functionality has been added to the MATLAB 
>> > interface over the last few months. We appreciate all the community's 
>> > support in making this possible and are happy to see all the progress that 
>> > is being made.
>> >
>> > At this point, we would like to create an initial "0.1" release of the 
>> > MATLAB interface. Incremental point releases will enable MATLAB users to 
>> > provide early feedback. In addition, learning how to navigate the release 
>> > process is an important step towards eventually releasing a stable 1.0 
>> > version of the MATLAB interface.
>> >
>> > Our proposed approach to creating an initial release would be to:
>> >
>> > 1. Manually build the MATLAB interface on Windows, macOS, and Linux
>> > 2. Combine all of the cross platform build artifacts into a single MLTBX 
>> > file [1] for distribution
>> > 3. Host the MLTBX somewhere that is easily accessible for download
>> >
>> > For reference - MLTBX is a standard packaging format for MATLAB which 
>> > enables simple "one-click" installation - analogous to a Python pip 
>> > package or a Ruby gem.
>> >
>> > Creating an MLTBX file manually should be relatively low effort. However, 
>> > in the long term, we would love to enable semi-automated "push button" 
>> > releases via GitHub Actions (and possibly even "nightly builds").
>> >
>> > Since this is our first time creating a release of the MATLAB interface, 
>> > we wanted to draw on the community's expertise to answer a few questions:
>> >
>> > 1. Is there a recommended location where we can host the MLTBX file? e.g. 
>> > GitHub Releases [2], JFrog [3], etc.?
>> > 2. Is there a recommended location for hosting release notes?
>> > 3. Is there a recommended cadence for incremental point releases?
>> > 4. Are there any notable ASF procedures [4] [5] (e.g. voting on a new 
>> > release proposal) that we should be aware of as we consider creating an 
>> > initial release?
>> > 5. How should the Arrow project release (i.e. 14.0.0) relate to the MATLAB 
>> > interface version (i.e. 0.1)? As a point of reference, we noticed that 
>> > PyArrow is on version 14.0.0, but it feels "misleading" to say that the 
>> > MATLAB interface is at version 14.0.0 when we haven't yet implemented or 
>> > stabilized all core Arrow APIs. Is there any precedent for using 
>> > independent release versions for language bindings which are not fully 
>> > stabilized and are also part of the main apache/arrow repository?
>> >
>> > We've noticed that Arrow-related projects which are not part of the main 
>> > apache/arrow GitHub repository (e.g. DataFusion) follow a mailing 
>> > list-based voting and release process. However, it's not clear whether it 
>> > makes sense to follow this process for the MATLAB interface since it is 
>> > part of the main apache/arrow repository.
>> >
>> > We sincerely appreciate the community's help and guidance on this topic!
>> >
>> > Please let us know if you have any questions.
>> >
>> > [1] https://www.mathworks.com/help/matlab/creating-help.html?s_tid=CRUX_lftnav
>> > [2] https://github.com/apache/arrow/releases
>> > [3] https://apache.jfrog.io/ui/native/arrow/
>> > [4] https://www.apache.org/foundation/voting.html
>> > [5] https://www.apache.org/legal/release-policy.html#release-approval
>> >
>> > Best Regards,
>> >
>> > Kevin Gurney
