Hi Sarah,

On Fri, Nov 10, 2023 at 18:48, Sarah Gilmore
(<sgilm...@mathworks.com.invalid>) wrote:
>
> Hi Kou,
>
> > We can use apache/arrow's GitHub Releases. The release
> > distribution document says that we can use GitHub as a
> > release platform:
> > https://infra.apache.org/release-distribution.html#other-platforms
> >
> > apache/arrow doesn't use GitHub Releases yet but
> > apache/arrow-adbc and apache/arrow-flight-sql-postgresql
> > already use GitHub Releases. (We just use "gh release
> > upload" to upload our artifacts to GitHub Releases.)
>
> Thank you for clarifying that we can use apache/arrow's GitHub Releases area
> for hosting the MLTBX file. We assumed we couldn't use the main repository,
> but it's great to hear we can!
>
> > BTW, how does File Exchange "Connecting to GitHub Repositories"?
> > https://www.mathworks.com/matlabcentral/content/fx/about.html#Why_GitHub
> >
> > Does it just use "polling"? Or do we need to install any
> > GitHub App, set secret variable or something on
> > apache/arrow? If the latter, we need to ask INFRA to do it.
>
> We are currently consulting with the development team responsible for the
> GitHub <-> File Exchange integration. We'll send a followup email with a
> concrete answer once we know more.
>
> > If we use GitHub Releases on apache/arrow, we can use the
> > following workflow. We don't need to use JFrog.
> >
> > 1. RC: Upload MLTBX to GitHub Releases for apache-arrow-X.Y.Z-rcN
> > 2. Release: Run a post release script that would:
> >   2.1 Download MLTBX from GitHub Releases for apache-arrow-X.Y.Z-rcN
> >   2.2 Upload it to GitHub Releases for apache-arrow-X.Y.Z
> >   2.3 Linked File Exchange entry will be automatically updated
>
> This seems like a much more streamlined approach. Not having to upload to
> JFrog will make things easier. Thanks for the suggestion!
>
> To clarify, in step 1, would we upload the MLTBX to ursacomputing/crossbow's
> GitHub Releases area [1]? Or, would we upload to apache/arrow's GitHub
> Releases area? If we upload release candidates to apache/arrow's GitHub
> Releases area, they would get automatically linked to the File Exchange.
> Ideally, we wouldn't want users to download release candidates.
>
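For what it's worth, here is a rough sketch of how the two steps quoted above (upload the MLTBX to the apache-arrow-X.Y.Z-rcN release, then copy it to the apache-arrow-X.Y.Z release) might be driven with the `gh` CLI. It only builds the command lines and prints them unless told to run them. The artifact file name, the helper names and the choice of apache/arrow as the target repository are assumptions for illustration; none of this exists in our release tooling today, and which repository should host the release candidates is still an open question.

```python
import subprocess

REPO = "apache/arrow"              # assumption: repository hosting both RC and final releases
MLTBX_NAME = "matlab-arrow.mltbx"  # hypothetical artifact file name


def rc_tag(version, rc):
    """Tag for a release candidate, e.g. apache-arrow-15.0.0-rc1."""
    return f"apache-arrow-{version}-rc{rc}"


def release_tag(version):
    """Tag for the final release, e.g. apache-arrow-15.0.0."""
    return f"apache-arrow-{version}"


def upload_cmd(tag, path):
    """gh command that attaches a file to an existing GitHub release."""
    return ["gh", "release", "upload", tag, path, "--repo", REPO]


def download_cmd(tag, pattern, dest):
    """gh command that downloads matching assets from a GitHub release."""
    return ["gh", "release", "download", tag, "--repo", REPO,
            "--pattern", pattern, "--dir", dest]


def promote_cmds(version, rc, dest="."):
    """Steps 2.1 and 2.2: fetch the MLTBX from the RC release, re-upload it to the final one."""
    return [
        download_cmd(rc_tag(version, rc), MLTBX_NAME, dest),
        upload_cmd(release_tag(version), f"{dest}/{MLTBX_NAME}"),
    ]


def run(cmds, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Step 1 (RC), then steps 2.1 and 2.2 (release); printed only.
    run([upload_cmd(rc_tag("15.0.0", 1), MLTBX_NAME)])
    run(promote_cmds("15.0.0", 1, dest="dist"))
```

Both target releases would need to exist before `gh release upload` is called, so a real script would also have to create them (or reuse whatever creates the release today).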
Currently, all the binaries are generated in the third step of the release
process [1], when we run `03-binary-submit.sh`. The crossbow job could build
the MLTBX artifact. Then, when we download the other binaries
(`04-binary-download.sh`), we would also download the MLTBX, and when we
upload the rest to JFrog (`05-binary-upload.sh`), we could upload the MLTBX to
GitHub Releases for apache-arrow-X.Y.Z-rcN.

Once the release is approved and we run the post-release tasks to "officially"
release, we would download the MLTBX and upload it to the new GitHub Releases
entry for apache-arrow-X.Y.Z. This can be done as another step in our
post-release tasks (post-xx-matlab.sh).

[1] https://arrow.apache.org/docs/developers/release.html#build-source-and-binaries-and-submit-them

> > We can use GitHub Releases as I said. But if we use GitHub
> > Releases, the release notes on GitHub Releases may include
> > not only the MATLAB interface but also all
> > implementations. It may not be useful for this use case.
> >
> > FYI: The R bindings have their release notes under
> > https://arrow.apache.org/docs/r/ . See
> > https://arrow.apache.org/docs/r/news/ .
>
> We think it would still be useful to link to the GitHub release notes from
> the File Exchange entry even if it includes notes for all language bindings.
> The File Exchange <-> GitHub integration just includes a link to the GitHub
> release notes under the Version History tab. If we find having a more focused
> version of the release notes would be useful, then we can create a markdown
> file analogous to the NEWS.md for the R bindings as you suggested (thanks for
> pointing this out).
>
> [1] https://github.com/ursacomputing/crossbow/releases
>
> Thanks for all your help!
>
> Best,
>
> Sarah Gilmore
> ________________________________
> From: Sutou Kouhei <k...@clear-code.com>
> Sent: Thursday, November 9, 2023 7:50 PM
> To: dev@arrow.apache.org <dev@arrow.apache.org>
> Cc: Sarah Gilmore <sgilm...@mathworks.com>; Lei Hou <lei...@mathworks.com>
> Subject: Re: [DISCUSS][MATLAB] Proposal for incremental point releases of the
> MATLAB interface
>
> Hi,
>
> > One open question about this approach: which GitHub
> > repository should we use for hosting the MLTBX via GitHub
> > Releases?
> >
> > We don't think using the main apache/arrow GitHub Releases
> > area is the right approach. So, would it make sense to
> > create a separate "bridge" repository just for hosting the
> > latest MLTBX files? Should this be an ASF associated
> > repository like apache/arrow-matlab or would a MathWorks
> > associated repository like mathworks/arrow-matlab be OK?
> > We aren't sure what makes the most sense here, but welcome
> > any suggestions.
>
> We can use apache/arrow's GitHub Releases. The release
> distribution document says that we can use GitHub as a
> release platform:
> https://infra.apache.org/release-distribution.html#other-platforms
>
> apache/arrow doesn't use GitHub Releases yet but
> apache/arrow-adbc and apache/arrow-flight-sql-postgresql
> already use GitHub Releases. (We just use "gh release
> upload" to upload our artifacts to GitHub Releases.)
>
> BTW, how does File Exchange "Connecting to GitHub Repositories"?
> https://www.mathworks.com/matlabcentral/content/fx/about.html#Why_GitHub
>
> Does it just use "polling"? Or do we need to install any
> GitHub App, set secret variable or something on
> apache/arrow? If the latter, we need to ask INFRA to do it.
>
> If we use GitHub Releases on apache/arrow, we can use the
> following workflow. We don't need to use JFrog.
>
> 1. RC: Upload MLTBX to GitHub Releases for apache-arrow-X.Y.Z-rcN
> 2. Release: Run a post release script that would:
>   2.1 Download MLTBX from GitHub Releases for apache-arrow-X.Y.Z-rcN
>   2.2 Upload it to GitHub Releases for apache-arrow-X.Y.Z
>   2.3 Linked File Exchange entry will be automatically updated
>
>
> > File Exchange entries have a "Version History" which
> > includes release notes from the "backing" GitHub Releases
> > area. So, this would probably be a sensible location to
> > put the release notes.
>
> We can use GitHub Releases as I said. But if we use GitHub
> Releases, the release notes on GitHub Releases may include
> not only the MATLAB interface but also all
> implementations. It may not be useful for this use case.
>
> FYI: The R bindings have their release notes under
> https://arrow.apache.org/docs/r/ . See
> https://arrow.apache.org/docs/r/news/ .
>
> > Also, including MATLAB updates in
> > Apache Arrow release blog posts
> > (e.g.
> > https://arrow.apache.org/blog/2023/11/01/14.0.0-release/)
> > may also be helpful.
>
> Yes. We should do it. :-)
>
>
> Thanks,
> --
> kou
>
> In
> <mn2pr05mb6496df713e917c66e30ab50cae...@mn2pr05mb6496.namprd05.prod.outlook.com>
> "Re: [DISCUSS][MATLAB] Proposal for incremental point releases of the MATLAB
> interface" on Wed, 8 Nov 2023 20:44:10 +0000,
> Kevin Gurney <kgur...@mathworks.com.INVALID> wrote:
>
> > Hi Kou and Dewey,
> >
> > Thank you very much for your very thorough and detailed responses to all of
> > our questions. This is extremely valuable feedback and the points that you
> > made make a lot of sense.
> >
> > Sarah and I talked this over a bit more and we think that sticking with the
> > overall apache/arrow project release cycle (i.e. stay in line with 15.0.0)
> > makes the most sense in the long term.
> >
> > @Dewey - thanks very much for highlighting the pros and cons of creating a
> > separate repository. We also really appreciate the community being willing
> > to try and support our development needs. That being said, we think it is
> > probably best to stay in-model with the main apache/arrow release process
> > for the time being rather than creating a separate repository for the
> > MATLAB interface.
> >
> > To address some related points and questions:
> >
> >> Can we just mention "This is not stable yet!!!" in the documentation
> >> instead of using isolated version?
> >
> > Yes. This is a good point and we already have a disclaimer in the README.md
> > [1] for the MATLAB interface which says: "Warning The MATLAB interface is
> > under active development and should be considered experimental."
> >
> >> It's better that we use CI for this like other binary packages such as
> >> .deb/.rpm/.wheel/.jar/...
> >
> > This makes sense and we agree. We will follow up with PRs to add the
> > necessary MATLAB packaging scripts and CI workflow files.
> >
> >> Does the MLTBX file include Apache Arrow C++ binaries too like .wheel/.jar?
> >
> > Yes. The MLTBX file will package the Apache Arrow C++ binaries, similar to
> > the Java JARs / Python wheels.
> >
> >> MATLAB doesn't provide the official package repository such as PyPI for
> >> Python and https://rubygems.org/ for Ruby, right?
> >
> > The equivalent to pypi.org or rubygems.org for MATLAB would be the
> > MathWorks File Exchange [2].
> >
> >> If the official package repository for MATLAB doesn't exist, JFrog is
> >> better because the MLTBX file will be large (Apache Arrow C++ binaries are
> >> large).
> >
> > As noted above, the "official package repository" for MATLAB would be the
> > MathWorks File Exchange. File Exchange has tight integration with GitHub
> > [3]. When a new release is available in GitHub Releases, the associated
> > File Exchange entry will be automatically updated.
> >
> > We believe we could leverage this integration between File Exchange and
> > GitHub Releases to automate the MATLAB interface release process. This
> > approach might look like:
> >
> > 1. Upload MLTBX to JFrog Artifactory
> > 2. Run a post release script that would:
> >   2.1 Download MLTBX from JFrog Artifactory
> >   2.2 Upload to GitHub Releases (e.g. apache/arrow-matlab - see discussion
> >   below)
> >   2.3 Linked File Exchange entry will be automatically updated
> >
> > One open question about this approach: which GitHub repository should we
> > use for hosting the MLTBX via GitHub Releases?
> >
> > We don't think using the main apache/arrow GitHub Releases area is the
> > right approach. So, would it make sense to create a separate "bridge"
> > repository just for hosting the latest MLTBX files? Should this be an ASF
> > associated repository like apache/arrow-matlab or would a MathWorks
> > associated repository like mathworks/arrow-matlab be OK? We aren't sure
> > what makes the most sense here, but welcome any suggestions.
> >
> >> We may want to use the status page for it:
> >> https://arrow.apache.org/docs/status.html
> >
> > Thanks for highlighting this. This makes sense, and we can follow up with a
> > PR to add MATLAB to the status page.
> >
> >> How about creating
> >> https://arrow.apache.org/docs/matlab/
> >> ? We can use Sphinx like the Python docs
> >> https://arrow.apache.org/docs/python/
> >> or another documentation tools like the R docs
> >> https://arrow.apache.org/docs/r/ . If we
> >> use Sphinx, we can create
> >> https://github.com/apache/arrow/tree/main/docs/source/matlab/
> >
> > This makes sense and eventually we want to have comprehensive documentation
> > in line with other language bindings using Sphinx. In addition to
> > comprehensive documentation, we were also hoping that we could host release
> > notes in a place that is easily accessible from the MLTBX download
> > location. File Exchange entries have a "Version History" which includes
> > release notes from the "backing" GitHub Releases area. So, this would
> > probably be a sensible location to put the release notes. Also, including
> > MATLAB updates in Apache Arrow release blog posts (e.g.
> > https://arrow.apache.org/blog/2023/11/01/14.0.0-release/)
> > may also be helpful.
> >
> > --
> >
> > We really appreciate all of the community's guidance on navigating the
> > release process!
> >
> > We will get started on integrating with the existing release tooling.
> >
> > [1] https://github.com/apache/arrow/tree/main/matlab#status
> > [2] https://www.mathworks.com/matlabcentral/fileexchange
> > [3] https://www.mathworks.com/matlabcentral/content/fx/about.html#Why_GitHub
> >
> > Best Regards,
> >
> > Kevin Gurney
> > ________________________________
> > From: Dewey Dunnington <de...@voltrondata.com.INVALID>
> > Sent: Tuesday, November 7, 2023 8:53 PM
> > To: dev@arrow.apache.org <dev@arrow.apache.org>
> > Cc: Sarah Gilmore <sgilm...@mathworks.com>; Lei Hou <lei...@mathworks.com>
> > Subject: Re: [DISCUSS][MATLAB] Proposal for incremental point releases of
> > the MATLAB interface
> >
> > For argument's sake, I might suggest that the process you described in
> > your initial note would probably work best in another repo: you would
> > be able to iterate faster and release/version at your own pace. The
> > flexibility you get from moving to a separate repo comes at the cost
> > of extra responsibility: you have to set up your own CI, manage your
> > own issues, and set up your own release verification scripts + release
> > votes on the mailing list. Because you bind Arrow C++, you would have
> > to take sufficient steps to ensure that the Arrow C++ developers are
> > made aware of changes that break the Matlab bindings and vice versa
> > (i.e., test against dev Arrow C++ in a CI job).
> >
> > Setting up that infrastructure for apache/arrow-nanoarrow took ~a week
> > of development time, and it now takes ~half a day to release a new
> > version (it took more for the first few versions, and the matlab
> > version has considerably higher complexity). Probably the biggest
> > barrier to releasing from another repo is that you have to ensure a
> > critical mass of PMC members can/will run your release verification
> > script and vote.
> >
> > I happen to feel that it's the PMC's/wider community's responsibility
> > to help language binding contributors adopt a workflow that suits
> > their needs. If active Matlab contributors agree that they want to
> > release version 0.1 from another repo, (I feel that) we're here to
> > help you do that. If the active contributors want to stay in
> > apache/arrow, there is less flexibility about what you release and
> > when; however, the release process is well-defined.
> >
> > On Tue, Nov 7, 2023 at 8:43 PM Sutou Kouhei <k...@clear-code.com> wrote:
> >>
> >> Hi,
> >>
> >> > As a point of reference, we noticed that PyArrow is on
> >> > version 14.0.0, but it feels "misleading" to say that the
> >> > MATLAB interface is at version 14.0.0 when we haven't yet
> >> > implemented or stabilized all core Arrow APIs.
> >>
> >> I can understand this but I suggest that we use the same
> >> version as other packages in apache/arrow. Because:
> >>
> >> * Using isolated version increases release complexity.
> >> * Using isolated version may introduce another
> >>   "misleading"/"confusion": For example, "the MATLAB
> >>   interface 1.0.0 uses Apache Arrow C++ 20.0.0" may be
> >>   misleading/confused:
> >>   * The MATLAB interface 1.0.0 doesn't use Apache Arrow C++
> >>     1.0.0.
> >>   * It may be difficult to find the corresponding
> >>     Apache Arrow C++ version from the MATLAB interface
> >>     version.
> >>
> >> Can we just mention "This is not stable yet!!!" in the
> >> documentation instead of using isolated version?
> >>
> >> We may want to use the status page for it:
> >> https://arrow.apache.org/docs/status.html
> >>
> >> > 1. Manually build the MATLAB interface on Windows, macOS, and Linux
> >>
> >> It's better that we use CI for this like other binary
> >> packages such as .deb/.rpm/.wheel/.jar/...
> >>
> >> If we release the MATLAB interface separately, which Apache
> >> Arrow C++ version is used? If we release the MATLAB
> >> interface right now, is Apache Arrow C++ 14.0.0 (the latest
> >> release) used or is Apache Arrow C++ main (not released yet)
> >> used? The MATLAB interface on main will depend on Apache
> >> Arrow C++ main, we may not be able to use the latest release
> >> for the MATLAB interface on main.
> >>
> >> > 2. Combine all of the cross platform build artifacts into
> >> > a single MLTBX file [1] for distribution
> >>
> >> Does the MLTBX file include Apache Arrow C++ binaries too
> >> like .wheel/.jar?
> >>
> >> > 3. Host the MLTBX somewhere that is easily accessible for download
> >>
> >> MATLAB doesn't provide the official package repository such
> >> as PyPI for Python and
> >> https://rubygems.org/
> >> for Ruby, right?
> >>
> >> > 1. Is there a recommended location where we can host the MLTBX file?
> >> > e.g. GitHub Releases [2], JFrog [3], etc.?
> >>
> >> If the official package repository for MATLAB doesn't exist,
> >> JFrog is better because the MLTBX file will be large (Apache
> >> Arrow C++ binaries are large).
> >>
> >> > 2. Is there a recommended location for hosting release notes?
> >>
> >> How about creating
> >> https://arrow.apache.org/docs/matlab/ ?
> >> We can use Sphinx like the Python docs
> >> https://arrow.apache.org/docs/python/ or another
> >> documentation tools like the R docs
> >> https://arrow.apache.org/docs/r/ .
> >> If we use Sphinx, we can create
> >> https://github.com/apache/arrow/tree/main/docs/source/matlab/ .
> >>
> >> > 3. Is there a recommended cadence for incremental point releases?
> >>
> >> I suggest avoiding separated release as above.
> >>
> >> > 4. Are there any notable ASF procedures [4] [5] (e.g. voting on a new
> >> > release proposal) that we should be aware of as we consider creating an
> >> > initial release?
> >>
> >> We don't need additional task for an initial release.
> >>
> >> > 5. How should the Arrow project release (i.e. 14.0.0)
> >> > relate to the MATLAB interface version (i.e. 0.1)? As a
> >> > point of reference, we noticed that PyArrow is on
> >> > version 14.0.0, but it feels "misleading" to say that
> >> > the MATLAB interface is at version 14.0.0 when we
> >> > haven't yet implemented or stabilized all core Arrow
> >> > APIs. Is there any precedent for using independent
> >> > release versions for language bindings which are not
> >> > fully stabilized and are also part of the main
> >> > apache/arrow repository?
> >>
> >> We don't have any precedent for using independent release
> >> versions for language bindings. All language bindings used
> >> the same version.
> >>
> >> Apache Arrow JavaScript isn't a language bindings but it
> >> used separated release and isolated versions before
> >> 0.4.1. It joined apache/arrow release after 0.4.1. (The next
> >> version of Apache Arrow JavaScript 0.4.1 is 13.0.0.)
> >>
> >> > We've noticed that Arrow-related projects which are not
> >> > part of the main apache/arrow GitHub repository
> >> > (e.g. DataFusion) follow a mailing list-based voting and
> >> > release process. However, it's not clear whether it makes
> >> > sense to follow this process for the MATLAB interface
> >> > since it is part of the main apache/arrow repository.
> >>
> >> If we want to use separated release for the MATLAB
> >> interface, we should follow the same release process as
> >> apache/arrow and other apache/arrow-* because it's the
> >> standard ASF release process.
> >>
> >>
> >> Thanks,
> >> --
> >> kou
> >>
> >> In
> >> <mn2pr05mb649619998eae9579cceba692ae...@mn2pr05mb6496.namprd05.prod.outlook.com>
> >> "[DISCUSS][MATLAB] Proposal for incremental point releases of the MATLAB
> >> interface" on Tue, 7 Nov 2023 20:31:31 +0000,
> >> Kevin Gurney <kgur...@mathworks.com.INVALID> wrote:
> >>
> >> > Hi All,
> >> >
> >> > A considerable amount of new functionality has been added to the MATLAB
> >> > interface over the last few months. We appreciate all the community's
> >> > support in making this possible and are happy to see all the progress
> >> > that is being made.
> >> >
> >> > At this point, we would like to create an initial "0.1" release of the
> >> > MATLAB interface. Incremental point releases will enable MATLAB users to
> >> > provide early feedback. In addition, learning how to navigate the
> >> > release process is an important step towards eventually releasing a
> >> > stable 1.0 version of the MATLAB interface.
> >> >
> >> > Our proposed approach to creating an initial release would be to:
> >> >
> >> > 1. Manually build the MATLAB interface on Windows, macOS, and Linux
> >> > 2. Combine all of the cross platform build artifacts into a single MLTBX
> >> > file [1] for distribution
> >> > 3. Host the MLTBX somewhere that is easily accessible for download
> >> >
> >> > For reference - MLTBX is a standard packaging format for MATLAB which
> >> > enables simple "one-click" installation - analogous to a Python pip
> >> > package or a Ruby gem.
> >> >
> >> > Creating an MLTBX file manually should be relatively low effort.
> >> > However, in the long term, we would love to enable semi-automated "push
> >> > button" releases via GitHub Actions (and possibly even "nightly builds").
> >> >
> >> > Since this is our first time creating a release of the MATLAB interface,
> >> > we wanted to draw on the community's expertise to answer a few questions:
> >> >
> >> > 1. Is there a recommended location where we can host the MLTBX file?
> >> > e.g. GitHub Releases [2], JFrog [3], etc.?
> >> > 2. Is there a recommended location for hosting release notes?
> >> > 3. Is there a recommended cadence for incremental point releases?
> >> > 4. Are there any notable ASF procedures [4] [5] (e.g. voting on a new
> >> > release proposal) that we should be aware of as we consider creating an
> >> > initial release?
> >> > 5. How should the Arrow project release (i.e. 14.0.0) relate to the
> >> > MATLAB interface version (i.e. 0.1)? As a point of reference, we noticed
> >> > that PyArrow is on version 14.0.0, but it feels "misleading" to say that
> >> > the MATLAB interface is at version 14.0.0 when we haven't yet
> >> > implemented or stabilized all core Arrow APIs. Is there any precedent
> >> > for using independent release versions for language bindings which are
> >> > not fully stabilized and are also part of the main apache/arrow
> >> > repository?
> >> >
> >> > We've noticed that Arrow-related projects which are not part of the main
> >> > apache/arrow GitHub repository (e.g. DataFusion) follow a mailing
> >> > list-based voting and release process. However, it's not clear whether
> >> > it makes sense to follow this process for the MATLAB interface since it
> >> > is part of the main apache/arrow repository.
> >> >
> >> > We sincerely appreciate the community's help and guidance on this topic!
> >> >
> >> > Please let us know if you have any questions.
> >> >
> >> > [1] https://www.mathworks.com/help/matlab/creating-help.html?s_tid=CRUX_lftnav
> >> > [2] https://github.com/apache/arrow/releases
> >> > [3] https://apache.jfrog.io/ui/native/arrow/
> >> > [4] https://www.apache.org/foundation/voting.html
> >> > [5] https://www.apache.org/legal/release-policy.html#release-approval
> >> >
> >> > Best Regards,
> >> >
> >> > Kevin Gurney