Yicong-Huang commented on issue #4227:
URL: https://github.com/apache/texera/issues/4227#issuecomment-3976184351

   Let’s have an informal discussion here first, and we can do an official vote 
if needed.
   
   My point is simple: **we should enforce an “issue first” policy.** In most 
open source projects, the issue tracker is the system of record for proposals, 
bugs, and planned work (todo). The expected flow is: open an issue, discuss and 
align, then implement via one or more PRs. In Spark and many earlier Apache 
projects, "issue" refers to a JIRA ticket. In our context, by "issue" I mean a 
GitHub Issue or a GitHub Discussion (the two serve the same role here).
   
   This mattered less when the project was small and centralized, and people 
could coordinate through face-to-face chats. But as the project grows and more 
contributors work async, that model breaks down. We need a shared, durable 
place for context and decisions.
   
   Issues provide that context. A single issue can capture the motivation, 
alternatives, and decisions, and then link to one or multiple PRs over time: 
initial attempt, revisions, follow-ups, even reverts. Without an issue, a PR 
shows up with no background. Reviewers don’t know why it exists, what options 
were considered, or whether there were prior attempts. If a PR gets closed and 
a new one appears later without linkage, all the context is lost. It also hurts 
contributors. Without issues, it’s hard to see what work is in flight and where 
the project is headed, so people duplicate effort or build something that gets 
rejected simply because the direction was decided elsewhere.
   
   In the [last 25 merged 
PRs](https://github.com/apache/texera/pulls?q=is%3Apr+draft%3Afalse+is%3Aclosed+is%3Amerged),
 15/25 had no linked issue. In the [latest 25 open 
PRs](https://github.com/apache/texera/pulls?q=is%3Apr+is%3Aopen+-is%3Adraft), 
18/25 have no linked issue. I tried to review some of them last week but found 
that I had no context at all. One example is 
[this PR](https://github.com/apache/texera/pull/4225), whose description only 
mentions dropping CI tests for Python 3.10 and 3.11, without giving any context 
or reasons. Another, already merged, 
[PR](https://github.com/apache/texera/pull/4210) only describes the removal of 
the deprecated feature, without giving its history or the reason for removing 
it.
   
   Enforcing "issue first" (or "issue before PR") would solve many of those 
problems:
    - Every PR has a clear “why” and a place to discuss alternatives.
    - Reviewers can pick up context quickly and avoid repeating questions across 
PR iterations.
    - Contributors can browse issues to find work, avoid duplication, and align 
on direction early.

   There are extra benefits too (triage, planning, release notes), but that’s 
not the main point here.
   
   I understand this adds one extra step, especially for small fixes. But for a 
growing project, it’s a small cost that pays back quickly in reviewer time, 
fewer misunderstandings, and smoother collaboration. If we want a lightweight 
rule: bugs, features, and behavior changes must have an issue; truly trivial 
changes (typos, comment fixes) can be exempt.
   
   That is the main thing I want to enforce. The PR template + GitHub Action 
(the implementation in #4228) is just one way to enforce "issue first". I am 
fine with using another method to enforce it. 
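
   For illustration only, here is a minimal sketch of the kind of check an 
"issue first" rule implies. It is not the implementation in #4228. The function 
names and the `trivial` exemption label are hypothetical; the only thing taken 
from GitHub itself is the set of closing keywords (close/fix/resolve and their 
variants) that link a PR to an issue.

```python
import re

# GitHub's closing keywords: close(s/d), fix(es/ed), resolve(s/d).
CLOSING_KEYWORDS = r"(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)"

# Matches e.g. "Closes #4227", "Fixes: #12", or "resolves apache/texera#4227".
LINKED_ISSUE = re.compile(
    rf"\b{CLOSING_KEYWORDS}:?\s+(?:[\w.-]+/[\w.-]+)?#(\d+)\b",
    re.IGNORECASE,
)

# Hypothetical label for truly trivial changes (typos, comment fixes).
EXEMPT_LABELS = {"trivial"}


def linked_issues(pr_body):
    """Return the issue numbers referenced with a closing keyword."""
    return [int(number) for number in LINKED_ISSUE.findall(pr_body or "")]


def passes_issue_first(pr_body, labels=()):
    """True if the PR is exempt or its description links at least one issue."""
    if EXEMPT_LABELS & set(labels):
        return True
    return bool(linked_issues(pr_body))
```

   A CI step could call `passes_issue_first` with the PR description and labels 
and fail the check when it returns `False`. How the description and labels are 
obtained is left out here, since that depends on the enforcement method chosen.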


