yzh119 commented on PR #89:
URL: https://github.com/apache/tvm-rfcs/pull/89#issuecomment-1271521751

   I'm a graduate researcher at UW and have worked as a full-time SDE at 
AWS AI for years, mostly on Deep Learning frameworks and libraries. I feel 
all of us agree that dynamic shapes are essential, so I won't spend more time 
emphasizing how important they are. I'm not a contributor to Relax, but I have 
been following it for a long time. I won't pretend to be neutral: I do 
think it is quite necessary to welcome Relax, rather than just adding dynamic 
shape support to Relay.
   
   The main controversy in this thread is whether to upgrade Relay 
incrementally or develop a new IR called Relax. I understand hardware companies 
appreciate stability, and we can see CUDA didn't change its interface 
drastically over the years; what a miracle! There must have been several 
attempts to develop new languages/compilers for NVIDIA GPUs, but CUDA survived. 
There is a lesson we should learn here: in the beginning, design things with a 
vision of the future in mind, then maintain them to a high standard, improve 
them incrementally, and stay customer-obsessed.
   
   This is the ideal story, but we should not ignore that, although CUDA was 
invented before the DL era, there were already many high-performance computing 
workloads its designers could refer to. Fortunately, even in 2022, the operators 
used in DL still align closely with HPC ones and are actually simpler (it's a 
world of GEMM). What about the story of (computational) graph-level IRs? The 
dominant workloads in DL change over time, and I would say they have caused a 
lot of headaches for framework and compiler designers: first 
CNNs/RNNs/LSTMs/Tree-LSTMs (the structural dynamism is one of the challenges 
Relay set out to tackle, but unfortunately those models are used almost nowhere 
now), then Transformers/GNNs (the latter not as hot as Transformers because of 
the hardware lottery, but who knows the future). Now we have entered a time 
when models converge but scale grows significantly: models become larger and 
larger, and many engineers and researchers propose compile-time techniques 
(checkpointing and rematerialization, quantization, graph substitution, fusion 
and stitching, sparsification and mixture-of-experts, hybrid parallelism) to 
optimize DL workloads. I'm glad to see many of them developed upon TVM, because 
TVM's design stays up to date and supports new workloads quickly. However, 
Relay's current design cannot take full advantage of these new techniques, and 
the system shows signs of becoming fragile. Relax is a great opportunity for us 
to reconsider the graph-level IR design: prune the redundancies and add new 
functionality. It's exciting that we can unify different levels of optimization 
in [TVM Unity](https://github.com/apache/tvm-rfcs/pull/91) once Relax is 
accepted by the community. Refactoring makes things simpler, not more complex.
   
   Whenever we find it's time to make changes, TVM embraces new 
designs. This has happened several times in TVM's history: prior to Relay, there 
was NNVM, which was deprecated and completely replaced by Relay. The previous 
Tensor Expression had limited expressiveness, and its schedule-tree data 
structure could not support tensorization elegantly, so we built TensorIR, which 
is not only backward compatible but also opens opportunities for developing 
new dialects (Ruihang and I designed SparseTIR upon it, and it works pretty 
well). AutoTVM could not generate scheduling templates automatically, so we 
built Ansor and the Meta-Schedule. I would emphasize that **the most important 
parts of all these updates were upstreamed within several months** without 
breaking backward compatibility; the credit goes to our hard-working and 
open-minded contributors and reviewers. Committing to TVM helps these 
contributors become MLC experts, and some of them are PMC members now. I would 
say none of these refactorings hurt TVM's reputation; on the contrary, people 
are impressed by TVM's speed in adapting to the future, and they are more 
willing to try TVM because it's open and driven by innovation.
   
   I really don't understand what's different this time, when it comes to 
Relax. We have a bigger community; this is awesome, and I definitely welcome 
your input and constructive suggestions on the future of this project. I view 
the [New Scoped Module 
RFC](https://discuss.tvm.apache.org/t/process-rfc-empowering-new-scoped-module-to-the-project/13617)
 as a contract between industrial developers and researchers/engineers like me 
who work on "toy prototypes": we promise not to touch anything that might 
influence user experience, and in return we don't want to be discouraged 
because our prototypes cannot be upstreamed and only live in some random GitHub 
repo as toys. I also think the new S0-S1-S2 process is already the most 
painless approach to delivering new designs, and its effect is equivalent to 
*incremental change*. If people take a look at the Relax repo, it already has a 
huge amount of code and well-written documentation (you can compare it 
with the official Relay documentation). I think it would be super inappropriate 
to ignore these contributors' devotion, especially individual contributors such 
as @LeshengJin. TVM has a huge user base of researchers; they are an important 
part of the community, and they contribute high-quality code instead of just 
hacks.
   
   Regarding the "lower standard than other communities" issue: TVM has high 
standards, and this debate is not about standards. If no fundamental changes 
were allowed in DL infrastructure, Google would have stayed at TF 1.0 and never 
developed JAX, and PyTorch would not have created so many different compiler 
infrastructures (I want to share [this 
slide](https://chips-compilers-mlsys-22.github.io/assets/slides/PyTorch%20Compilers%20(Compiler%20&%20Chips%20Symposium%202022).pdf)
 again).
   
   It's 5 am in my timezone; I should get some sleep, as I'm still recovering 
from a recent illness. These opinions are my own, and I don't speak for any 
group or organization.
   
   Best,
   Zihao

