Hi,

I've been working through a proof-of-concept migration, moving specific
ARROW issues into a personal test repo on GitHub. I've made this repo
public to help demonstrate limitations, successes and options as we
consider this further:

https://github.com/toddfarmer/test_import/issues?q=is%3Aissue

These issues represent a progression; the most recently created will be
most representative of the current state. I intend to keep making tweaks to
test things. Some notes:

Jira markup and GitHub markdown differ.  There are conversion utilities that
can be used, or we could wrap the original content in a code block, treating
the Jira markup as literal, quoted content. Some examples:

Converted: https://github.com/toddfarmer/test_import/issues/13
Wrapped in code block: https://github.com/toddfarmer/test_import/issues/10
Unmodified: https://github.com/toddfarmer/test_import/issues/7

This impacts a related concern. User identifiers differ between systems,
and straight conversion might result in mentions of completely unrelated
individuals (see the converted example above; jswenson in GitHub !=
jswenson in Jira). If we convert, we would need to at least do some
additional conversion to eliminate GitHub user references. If we don't
convert markdown, we still need to cleanse any non-code references that
start with "@". This was noted in the Spring migration, as Java annotations
(@Bean, @Configuration, etc.) in plain-text surprisingly mapped to GitHub
user account mentions. This is likely uncommon in ARROW tickets, but we
should make sure.
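
As a rough illustration of that cleansing step, here is a minimal Python
sketch. It assumes that wrapping a mention in backticks is enough to stop
GitHub from linking it to a user account; the function name and the regular
expressions are mine, not part of any existing import tool.

```python
import re

# "@name" not preceded by a word character or backtick, so e-mail
# addresses (user@example.com) are left untouched.
MENTION = re.compile(r"(?<![\w`])@([A-Za-z0-9][A-Za-z0-9-]*)")
# Fenced blocks and inline code spans, which should be kept verbatim.
CODE = re.compile(r"(```.*?```|`[^`\n]*`)", re.DOTALL)


def neutralize_mentions(text: str) -> str:
    """Wrap @mentions outside code in backticks (hypothetical helper)."""
    parts = CODE.split(text)
    # With a single capturing group, even indexes are non-code text and
    # odd indexes are the code spans matched by CODE.
    for i in range(0, len(parts), 2):
        parts[i] = MENTION.sub(r"`@\1`", parts[i])
    return "".join(parts)
```

Running this over each issue body and comment before import would leave
existing code spans alone and quote everything else that looks like a mention.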

The username differences also mean we cannot retain issue and comment
authorship, assignment or watching details during migration. An individual
who created an ASF Jira issue may be unaware of progress recorded only in a
post-migration GitHub issue. Similarly, somebody may self-assign a GitHub
issue, unaware that the ASF Jira issue it was based upon is being worked by
somebody else. These concerns can likely be mitigated to some extent with
comments, noting in the Jira ticket the migrated GitHub issue for further
work, and a comment in the GitHub issue if the source Jira issue was
assigned at time of migration.

Also on the topic of user accounts, we'll likely want to perform any
migration using a dedicated system account. Note that all test issues and
comments show as being created by me, again because there is no mapping of
user identifiers, but also because it may not be possible to specify the
authors during import. I've added notes to each migrated issue and comment
to specify the original author, with a link back to the ASF Jira record.
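
For illustration only, the attribution note could be produced along these
lines. The wording of the note and the function name are assumptions; the
test issues may format it differently.

```python
JIRA_BROWSE = "https://issues.apache.org/jira/browse/"


def with_attribution(body: str, key: str, author: str) -> str:
    """Prefix a migrated body with the original author and Jira link.

    `key` is the Jira issue key (e.g. "ARROW-123", a made-up example) and
    `author` is the Jira display name, kept as plain text.
    """
    note = f"Originally reported by {author} in ASF Jira: {JIRA_BROWSE}{key}"
    return note + "\n\n" + body
```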

Timestamps of issue creation, update and issue comments can be retained in
the migration, and the test issues reflect that.

As best I can tell, it is possible to indicate whether the issue is closed
or not during import to GitHub. It is not possible to specify the exact
status on import, and closed issues map to resolved/completed. That may be
a bit confusing for those ASF Jiras which were rejected/declined/etc. I'm
still working to see whether there are any meaningful workarounds to
distinguish resolution. It may require using a bespoke label, then doing a
batch update after import against issues having that label.
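
A sketch of that label-then-batch-update idea, in Python. The resolution
names, the label prefix and both helper names are illustrative assumptions.
The second helper also assumes that the GitHub issue update endpoint accepts
a "state_reason" of "completed" or "not_planned"; I have not verified that it
can be used this way after an import.

```python
# Hypothetical set of Jira resolutions that should not read as "completed".
NOT_DONE = {"Won't Fix", "Duplicate", "Not A Problem", "Invalid"}
LABEL_PREFIX = "resolution: "


def plan_issue(resolution):
    """Return (closed, labels) to use at import time.

    `resolution` is the Jira resolution name, or None for an open issue.
    """
    if resolution is None:
        return False, []
    if resolution in NOT_DONE:
        return True, [LABEL_PREFIX + resolution.lower()]
    return True, []


def close_payload(labels):
    """Build the body for a post-import update of one closed issue."""
    rejected = any(label.startswith(LABEL_PREFIX) for label in labels)
    reason = "not_planned" if rejected else "completed"
    return {"state": "closed", "state_reason": reason}
```

The batch step would then search for issues carrying a "resolution: ..."
label and send the corresponding payload for each one.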

I was able to map ASF Jira Component values into labels. I have not yet
started looking at other metadata (e.g., fixed version, etc.).
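
The component mapping is simple enough to show as a small Python sketch. The
"Component: " prefix is an assumption for illustration; any naming scheme
would work the same way.

```python
def component_labels(components, prefix="Component: "):
    """Turn Jira Component names into GitHub label names.

    Blank names are skipped and duplicates are dropped, keeping the
    original order.
    """
    labels = []
    for name in components:
        name = name.strip()
        label = prefix + name
        if name and label not in labels:
            labels.append(label)
    return labels
```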

Feedback and suggestions are welcome.

Thanks,

Todd

On Mon, Oct 24, 2022 at 11:25 AM Weston Pace <weston.p...@gmail.com> wrote:

> +1 for GH issues mainly because it lowers the barrier to entry and
> JIRA won't be an acceptable solution any longer with infra's proposed
> changes.  I suspect I'd be +1 even without the infra change though
> providing everyone else was willing to make the switch.
>
> On Mon, Oct 24, 2022 at 8:19 AM Jacob Wujciak
> <ja...@voltrondata.com.invalid> wrote:
> >
> > +1
> >
> > While there will be some work associated with migrating to Github Issues
> I
> > think it is the only viable solution that does not impose an untenable
> > burden on the PMC. Additionally I think that using gh issues will lower
> the
> > barrier for new contributions as experienced by arrow-rs. I don't think
> > another third-party tool is the solution as it would add maintenance
> burden
> > on the arrow community (I doubt INFRA will setup anything else in
> addition
> > to JIRA) with questionable value.
> >
> > I have no experience with github discussions but reading about it, it
> might
> > be a good replacement for the functions our issues currently have with a
> > more forum/board like format that might increase discoverability of
> > discussions. Issue template can now do more than just be prefilled with
> > text but actually act as forms: [1]
> >
> > > * Issue links: It seems that we can't do this.
> > Well we can mention #issue_number in the comment closing the issue and gh
> > issues now have two distinct closing states for done and
> > won't-fix/duplicate.
> >
> > [1]:
> >
> https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/configuring-issue-templates-for-your-repository#creating-issue-forms
> >
> > On Mon, Oct 24, 2022 at 8:56 AM Sutou Kouhei <k...@clear-code.com> wrote:
> >
> > > Hi,
> > >
> > > +1 on migration.
> > >
> > > > The one thing I would not want to lose, though, is the categorization
> > > > facilities we currently have in Jira. Namely: Component, Affects
> > > > version, Fix version, Type (bug/improvement/task...), Issue links
> > > > (superceded by/relates to/is caused by...), Priority (at least
> > > > Minor/Major/Blocker).
> > >
> > > I tried using some GitHub features.
> > >
> > > * Component: We already use GitHub's label feature
> > >   * e.g.: lang-c++:
> > >
> > >
> https://github.com/apache/arrow/pulls?q=is%3Apr+is%3Aopen+label%3Alang-c%2B%2B
> > > * Affects version: Create new labels such as "affect-10.0.0"?
> > > * Fix version: We can use GitHub's milestone feature
> > >   * I tried creating the "11.0.0" milestone:
> > >     https://github.com/apache/arrow/milestone/1
> > > * Type: GitHub's label feature or custom field in GitHub's
> > >   project feature?
> > >   * I tried creating a GitHub project for Apache Arrow:
> > >     https://github.com/orgs/apache/projects/148
> > >     * All Apache Arrow committers have Admin role. You can
> > >       change anything to learn GitHub's project feature.
> > > * Issue links: It seems that we can't do this.
> > > * Priority: GitHub's label feature or custom field in GitHub's
> > >   project feature?
> > >
> > >
> > > Thanks,
> > > --
> > > kou
> > >
> > > In <82d49482-706d-081b-32e7-f692bc282...@python.org>
> > >   "Re: [DISCUSS] Move issue tracking to <something>" on Sat, 22 Oct
> 2022
> > > 16:19:14 +0200,
> > >   Antoine Pitrou <anto...@python.org> wrote:
> > >
> > > >
> > > > Hi Neal,
> > > >
> > > > Le 22/10/2022 à 15:35, Neal Richardson a écrit :
> > > >> Their email says:
> > > >>
> > > >>> Infra knows this process change places an increasing burden on PMC
> > > >>> members
> > > >>> for managing contributors, and makes it harder for people to
> > > >>> contribute
> > > >> bug reports.
> > > >>> We suggest projects consider using GitHub Issues for
> customer-facing
> > > >> questions/bug
> > > >>> reports/etc., while maintaining development issues on Jira.
> > > >> but I think that having a two-tiered system for issue tracking
> > > >> presents
> > > >> some notable downsides for us, including:
> > > >> * Increased barriers to entry for new contributors and a sense of
> > > >> inequality between "us" and "them". There's already too much
> friction
> > > >> IMO,
> > > >> and this pushes that up significantly.
> > > >> * Maintenance burden of triaging and synchronizing issues across
> > > >> * trackers
> > > >> sounds like a lot for us to take on. I'd prefer the active
> maintainers
> > > >> on
> > > >> the project spend their time shipping useful, reliable software, not
> > > >> doing
> > > >> bookkeeping.
> > > >
> > > > I fully agree with your concerns.  So I'm +1 on migrating to
> > > > *something else*.
> > > >
> > > > The one thing I would not want to lose, though, is the categorization
> > > > facilities we currently have in Jira. Namely: Component, Affects
> > > > version, Fix version, Type (bug/improvement/task...), Issue links
> > > > (superceded by/relates to/is caused by...), Priority (at least
> > > > Minor/Major/Blocker).
> > > >
> > > > How much of that can be recreated in Github Issues, or any other
> > > > alternative?
> > > >
> > > > A secondary question is whether it's possible to migrate the current
> > > > issues. Would be nice to have, but not blocking either (IMHO).
> > > >
> > > > Regards
> > > >
> > > > Antoine.
> > >
>
