I agree with the "Jira first" approach and with the specification vs. implementation distinction, and we should try to keep those two discussions separate.
As others have already mentioned, I think we should follow a set of basic rules or guidelines to enforce this policy, e.g.:
- We should make an effort to create complete, well-described Jira tickets, and discuss specification details there.
- Whenever a specification/conceptual discussion arises in a PR, it should be moved to the Jira ticket; we should keep PR reviews for implementation-only discussions.
- We should use the "discussion in jira" label on the PR in these cases (and remove it only once the discussion has been settled).
- Before merging a patch, the committer should (must) check not only that there are no pending remarks on the PR, but also that no question or comment is left unaddressed in the Jira ticket.

If we agree on this set of rules (or others), perhaps we should update our contributing guidelines [1] to reflect them. Nevertheless, apart from documenting them somewhere (which is important), the key thing would be to enforce them in practice.

Best,
Ruben

[1] https://calcite.apache.org/develop/#contributing

On Wed, May 21, 2025 at 9:19 AM Alessandro Solimando <alessandro.solima...@gmail.com> wrote:

> Hello,
> as a developer and user of Calcite, I have always appreciated the quality
> of the Jira tickets when researching information; it's easier to understand
> the specification there instead of trying to infer it from discussions in
> the code, or the code itself.
>
> Another great benefit is that sometimes discussions in the ML/Jira, being
> more conceptual, foster references to academic literature, design choices
> of other databases etc.; I have learned a lot about other systems that way.
>
> I also agree that, sometimes, conceptual discussions arise during code
> review, but I see no problem in proposing to move back to Jira until those
> points are settled.
> I have proposed that to contributors a few times, especially when I am
> unsure how to move forward, as there are higher chances to get an extra
> opinion and visibility in Jira rather than in PR reviews (I personally try
> to at least skim all tickets, but I don't always have the time to check
> PRs).
>
> In some extreme cases, it's even good to pause and move the discussion to
> the ML, and resume the specification/implementation when the issue has been
> clarified.
>
> I feel that the "discussion in jira" label could help: whenever there is
> some pending discussion in Jira that should be resolved before finalizing
> the code, the label should be added to the PR, and committers should then
> make sure the discussion is settled before merging. I feel it's easier/safer
> if the person raising the discussion in Jira adds the label to the PR
> themselves, as it's easy to miss a discussion in Jira in case the message
> is written when you have already started looking at the code, and this is
> especially true for people reviewing a lot of PRs (kudos to them!).
>
> IMO, if we want specifications to be valued, there needs to be a rewarding
> scheme associated. In an OSS project, that means requiring people to craft
> good Jira titles/descriptions before merging their changes (as you already
> proposed), but also making such aspects crucial when evaluating whether to
> invite new committers: they should demonstrate good code contributions, but
> also the ability to produce clear tickets and specifications, for their own
> contributions and those of others, acting themselves as autonomous quality
> gatekeepers (for code, specs, jira handling, etc.), because this is what we
> will expect from them anyway later on.
>
> Best regards,
> Alessandro
>
> On Tue, 20 May 2025 at 21:03, Julian Hyde <jhyde.apa...@gmail.com> wrote:
> >
> > Mihai wrote:
> > >
> > > I find the JIRA search difficult to use compared with github search.
> > > As the number of issues grows, it's harder and harder to sift
> > > through them.
> >
> > Jira search isn’t great, but it isn’t terrible. The ability to
> > cross-reference cases (with ‘related’ links) is more powerful than
> > GitHub’s search.
> >
> > One pattern I use is to start in the code (it helps a lot if the test
> > case is in the right place) and then navigate back to Jira via git blame.
> >
> > > Also, after the PR sometimes design questions creep on github
> > > too as part of the review process. It's not easy to have a dialogue
> > > that alternates between the two sites - no causality is maintained.
> >
> > I agree, the parallel conversations are difficult. I don’t have an easy
> > answer. But I strongly believe that there should be a “specification”
> > thread to the conversation and an “implementation” thread (or threads).
> > GitHub handles the “implementation” threads well, because people just
> > comment on the relevant line of code. (Except where the problem with the
> > implementation is a line of code that was NOT written.)
> >
> > I think that reviewers should strenuously force specification topics over
> > to Jira.
> >
> > > A third problem is that we have lots of contributors for whom
> > > English is a second language (I am one of them), and sometimes
> > > the code is easier to read than the spec...
> >
> > I call bullshit on that one. As a native English speaker, I can tell you
> > that writing specifications is hard for me too. The first five years of
> > my career I worked in UK and all of my colleagues were British. There were
> > bad engineers too lazy to write specifications, and sometimes they worked
> > for bad managers who let them get away with it.
> >
> > I know you are being kind to contributors from other cultures. But I
> > don’t think we are doing contributors any favors by lowering our standards.
> > (Certainly we are doing our reviewers and users no favors.)
> >
> > We just need to create a culture where specifications are valued.
> >
> > Julian
> >
> >
> > > On May 20, 2025, at 11:44 AM, Mihai Budiu <mbu...@gmail.com> wrote:
> > >
> > > It's a good policy.
> > >
> > > But I find the JIRA search difficult to use compared with github search.
> > > As the number of issues grows, it's harder and harder to sift through
> > > them.
> > >
> > > Also, after the PR sometimes design questions creep on github too as
> > > part of the review process. It's not easy to have a dialogue that
> > > alternates between the two sites - no causality is maintained.
> > >
> > > A third problem is that we have lots of contributors for whom English is
> > > a second language (I am one of them), and sometimes the code is easier to
> > > read than the spec...
> > >
> > > Mihai
> > >
> > > ________________________________
> > > From: Julian Hyde <jh...@apache.org>
> > > Sent: Tuesday, May 20, 2025 11:34 AM
> > > To: dev@calcite.apache.org <dev@calcite.apache.org>
> > > Subject: [DISCUSS] Jira first
> > >
> > > Calcite has always been a "Jira first" project, where all significant
> > > commits have a Jira case number (CALCITE-nnnn). We've not allowed
> > > patch attachments since the very early days, so each of those commits
> > > also has a GitHub pull request (PR).
> > >
> > > Given that discussion can occur on the Jira case and the PR, does it
> > > matter where that discussion occurs? In my opinion, it makes a great
> > > deal of difference.
> > >
> > > In engineering, it is essential to separate the specification of a
> > > change (bug or feature request) from implementation. The specification
> > > is what that change does, and the implementation is how it does it.
> > > The specification can be understood by the end-user of the change
> > > (often the user who writes SQL queries, but sometimes an engineer who
> > > is using Calcite's public or private APIs), whereas an implementation
> > > may include a brief description of an algorithm but is mainly just
> > > code (and tests).
> > >
> > > Which is more important: specification or implementation? In my
> > > opinion, specification is way more important. From a good description
> > > of the problem, even a good one-line summary, an engineer can in most
> > > cases create an implementation. The specification also serves
> > > end-users (reading the release notes), it serves as documentation for
> > > future users of the feature, and helps future maintainers figure out
> > > how the project fits together. But if all we have is code, the only
> > > way to understand what has been done is to read the code. This doesn't
> > > scale.
> > >
> > > This has come up a couple of times recently.
> > >
> > > In https://issues.apache.org/jira/browse/CALCITE-7013 /
> > > https://github.com/apache/calcite/pull/4374 there were discussions in
> > > both the Jira and GitHub about whether this was even a desirable
> > > change. Mihai ended up merging the PR even though I had said "This is
> > > not a bug" in the Jira case. This is basically one committer
> > > overriding (albeit unintentionally) another committer's -1.
> > >
> > > In https://issues.apache.org/jira/browse/CALCITE-7029 /
> > > https://github.com/apache/calcite/pull/4392 the summary is "Support
> > > DPhyp to handle various join types", which is meaningless even to
> > > someone like me who follows academic work on query optimization.
> > > Jensen added a comment in the PR asking for a link to the paper where
> > > the 'DPhyp' term was defined. (Thank you Jensen!) But really, all work
> > > reviewing the PR should stop, until we have a good description in the
> > > Jira case.
> > >
> > > I would like us to adopt two policies:
> > > * A committer should not merge a PR until the Jira has a good summary
> > > and description.
> > > * Discussion in a PR about specification (what, as opposed to how)
> > > should be moved to Jira or the dev list.
> > >
> > > (Personally, I will not even look at a PR until the Jira is in good
> > > shape, but I don't expect most people would go that far.)
> > >
> > > Do people have comments on how we use Jira vs GitHub PRs, and how we
> > > balance specification, implementation, and tests?
> > >
> > > Julian