Yicong-Huang commented on issue #4227: URL: https://github.com/apache/texera/issues/4227#issuecomment-3976184351
Let’s have an informal discussion here first, and we can do an official vote if needed.

My point is simple: **we should enforce an “issue first” policy.**

In most open source projects, the issue tracker is the system of record for proposals, bugs, and planned work (todo). The expected flow is: open an issue, discuss and align, then implement via one or more PRs. In Spark and many earlier Apache projects, "issue" refers to a JIRA ticket. In our context, by "issue" I mean a GitHub Issue or a GitHub Discussion (I treat the two as equivalent here).

This mattered less when the project was small and centralized, and people could coordinate through face-to-face chats. But as the project grows and more contributors work asynchronously, that model breaks down. We need a shared, durable place for context and decisions.

Issues provide that context. A single issue can capture the motivation, alternatives, and decisions, and then link to one or multiple PRs over time: initial attempt, revisions, follow-ups, even reverts.

Without an issue, a PR shows up with no background. Reviewers don’t know why it exists, what options were considered, or whether there were prior attempts. If a PR gets closed and a new one appears later without linkage, all the context is lost.

It also hurts contributors. Without issues, it’s hard to see what work is in flight and where the project is headed, so people duplicate effort or build something that gets rejected simply because the direction was decided elsewhere.

In the [last 25 merged PRs](https://github.com/apache/texera/pulls?q=is%3Apr+draft%3Afalse+is%3Aclosed+is%3Amerged), 15/25 had no linked issue. In the [latest 25 open PRs](https://github.com/apache/texera/pulls?q=is%3Apr+is%3Aopen+-is%3Adraft), 18/25 have no linked issue. I tried to review some of them last week but found I had no context at all.
An example is [this one](https://github.com/apache/texera/pull/4225), where the PR description says only that it drops CI tests for Python 3.10 and 3.11, without any context or reasons. Another, already merged, [PR](https://github.com/apache/texera/pull/4210) describes only the removal of a deprecated feature, without its history or the reason for removing it.

By enforcing "issue first" (or "issue before PR"), many of those problems can be solved:

- Every PR has a clear “why” and a place to discuss alternatives.
- Reviewers can get up to speed quickly and avoid repeating questions across PR iterations.
- Contributors can browse issues to find work, avoid duplication, and align on direction early.

There are extra benefits too (triage, planning, release notes), but that’s not the main point here.

I understand this adds one extra step, especially for small fixes. But for a growing project, it’s a small cost that pays back quickly in reviewer time, fewer misunderstandings, and smoother collaboration. If we want a lightweight rule: bugs, features, and behavior changes must have an issue; truly trivial changes (typos, comment fixes) can be exempt.

That is the main thing I want to enforce. The PR template + GitHub Action (the implementation in #4228) is just one way to enforce "issue first". I am fine if we use another method to enforce it.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
