areusch commented on PR #95:
URL: https://github.com/apache/tvm-rfcs/pull/95#issuecomment-1338415060

   I appreciate you guys bringing this back up, and I appreciate the goal here 
of establishing a project policy to ensure we're welcoming to new projects as 
the MLC landscape changes. I still have some reservations about this proposal 
in its current form, but I would support it if a small number of changes were 
made. Here is what I think it needs:
   
   1. **Approval mechanism**. I disagree with @Hzfengsy's statement that 
"establishing a scoped module is not a typical code change." Because the 
typical S0 module would reside inside `apache/tvm` on `main`, a subproject at 
the moment is essentially several PRs that add code to the codebase. This is a 
typical code change, so I don't see a reason to adopt Lazy Majority consensus 
here. I appreciate that Hive uses Lazy Majority, but Hive has [52 PMCs and 104 
committers](https://projects.apache.org/committee.html?hive), and also has a 
process by which committers and PMCs automatically gain [emeritus 
status](https://cwiki.apache.org/confluence/display/Hive//Bylaws#Bylaws-Committers)
 after 6 months of inactivity. TVM has a much smaller PMC. I would support this 
proposal if Lazy Consensus was the voting mechanism.
   
       One of my practical concerns here is that we lose recourse for 
preventing overload of the CI. And while I do want us to be welcoming of new 
projects, I could see the following situation arising that would place 
committers in an awkward position:
       1. A new S0 module is proposed and a majority of PMC members approve. 
The S0 module is accepted.
       2. PRs for the new module begin to merge.
       3. At some point, the integration test suite is added, but this involves 
regression tests that add 1 hour to the CI. The proposer either doesn't have 
time to rewrite the tests, or it's not practical to do so. Meanwhile a 
committer more familiar with the CI flags this, but the proposer cites 
acceptance of the RFC as grounds that the project has chosen to merge the new 
S0 module and its tests.
   
       CI is a bit tricky since it's complex enough that not everyone 
understands it, but it's a shared community resource that underpins the whole 
project. I think that choosing community over code means that we don't 
overextend the project so that we can ensure that such shared resources are 
available to everyone to work towards the project's goals. While I agree the 
community has been generally involved with keeping the CI green, only 4 of 20 
PMCs have made or been involved with 
[changes](https://github.com/apache/tvm/commits/main/Jenkinsfile) to 
`Jenkinsfile` in the past year (and of those, only 2 made changes other than 
updating a CI image). Changing Jenkinsfile is the primary way we do things like 
resharding tests to reduce apparent CI runtime. I'm not sure Lazy majority is 
the right tool to evaluate this aspect of the decision, considering that adding 
S0 modules may involve significant changes to the CI.
   
       To conjure another situation at the other end of the spectrum: Someone 
new to the project proposes that we add an S0 module, but although they get a 
few supporters, they don't manage to get majority PMC support. Perhaps they 
don't know many PMCs, or the PMC doesn't know much about the relevant subject 
area. Is the proposal rejected? Majority seems like it could also be a high bar 
here, and if we intentionally adopt a different voting mechanism to accept new 
projects, the rules seem more confusing for newcomers.
   
      It doesn't make sense to me to have two different voting mechanisms 
involved with adding code (one for PRs, and one for RFCs that propose a bunch 
of PRs to add isolated code). There are many situations today in which 
committers and PMCs can block changes to the project, and we rely on a sense of 
community to avoid such deadlock. I continue to think that fostering consensus 
in the community is the best way to choose community over code. Should we come 
to a point where the community needs to choose between two paths forward, such 
as when making an S1 -> S2 change, and consensus cannot be reached, then that 
is a case where I think Lazy Majority may be a reasonable path to making 
forward progress.
   
   2. **S1 and S2 module lifecycle**. This RFC talks a lot about S0 proposals, 
and it is great that we are creating a welcoming stance towards new 
contributions. However, I would like to see the proposal rounded out with 
discussion of how an S0 module should become S1 and how an S1 module can 
deprecate others (i.e. S1 -> S2). I agree that some of this is covered in the 
RFC today, but because the RFC doesn't include complete discussions around the 
S1 and S2 phases, I think it's hard to understand the S1 and S2 concepts. 
Here's a specific example:
   
       > There can be follow-up steps (S1), such as making a dialect broadly 
accessible in existing compilation flows.
       
       This sentence, inside "Background," discusses the existence of S1 steps, 
giving one example. However, the way it's worded makes it sound like others may 
exist; however those aren't enumerated there. It doesn't really make sense to 
enumerate them in the Background section, but without a full S1 section, there 
isn't a good place to put that info. Here I am thinking of new contributors and 
wondering what a new contributor would do should they want to understand the 
full process. Where could they look?
   
       Some specific questions I can imagine readers having that I think should 
be answered in the same place as theSS0 proposal:
       - How should someone propose to make a module S1? Is it an RFC? What 
voting mechanism is used?
       - Are there additional roadmap or documentation requirements placed on 
modules that become S1?
       - At what point do we decide to deprecate modules (i.e. move other 
modules to S2 state)? What voting mechanism is used for that decision? How long 
is the typical notice period?
   
   5. **What is "stable" in TVM and how long is it stable for?** I'm asking the 
specific questions in the previous point because industry contributors need to 
have an understanding of such things for project planning purposes. Is 
[`GraphExecutor`](https://github.com/apache/tvm/blob/main/python/tvm/contrib/graph_executor.py)
 stable? It's still in `contrib` years after being established as the reference 
for many unittests. 
    
       I would like to see us be okay with making it explicit which modules are 
S0, S1, and S2 in the TVM codebase. Doing this is not intended to make any 
judgement on such code, nor is it intended to imply the quality is poor--it is 
merely intended to guide newcomers so that they understand which paths have 
been well-established in TVM and which are under development.
   
   I'd be happy to discuss any of this further. I think these are the same 
three points I've been asking for in the Discuss post and initially on this 
review thread, and I appreciate that you guys have been iterating on the RFC. 
Hopefully this clarifies my sticking points.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to