yzh119 commented on PR #89:
URL: https://github.com/apache/tvm-rfcs/pull/89#issuecomment-1271521751

   I'm a graduate researcher at UW and have worked as a full-time SDE at 
AWS AI for years, mostly on Deep Learning frameworks and libraries. I feel 
all of us agree that dynamic shapes are essential, so I won't spend more time 
emphasizing how important they are. I'm not a contributor to Relax, but I have 
been following it for a long time. I won't pretend to be neutral: I do 
think it is quite necessary to welcome Relax, rather than just adding dynamic 
shape support to Relay.
   
   The main controversy in this thread is whether to upgrade Relay 
incrementally or develop a new IR called Relax. I understand hardware companies 
appreciate stability, and we can see CUDA didn't change its interface 
drastically over the years; what a miracle! There must have been several 
attempts to develop new languages/compilers for NVIDIA GPUs, but CUDA survived. 
There is a lesson we should learn here: in the beginning, design things with a 
vision of the future in mind, then maintain them to a high standard, improve 
them incrementally, and stay customer-obsessed.
   
   This is the ideal story, but we should not ignore that, although CUDA was 
invented before the DL era, there were already many high-performance computing 
workloads its designers could refer to. Fortunately, even in 2022, the operators 
used in DL still align closely with HPC ones and are actually simpler (it's a 
world of GEMM). What about the story of (computational) graph-level IRs? The 
dominant workloads in DL change over time, and I would say they have caused a 
lot of headaches for framework and compiler designers: first 
CNNs/RNNs/LSTMs/Tree-LSTMs (the structural dynamism is one of the challenges 
Relay set out to tackle, but unfortunately those models are used almost nowhere 
now), then Transformers/GNNs (the latter not as hot as Transformers because of 
the hardware lottery, but who knows the future). Now we have entered a time 
when models converge but scale grows significantly: models become larger and 
larger, and many engineers and researchers propose compile-time techniques 
(checkpointing and rematerialization, quantization, graph substitution, fusion 
and stitching, sparsification and mixture-of-experts, hybrid parallelism) to 
optimize DL workloads. I'm glad to see many of them developed upon TVM, because 
TVM's design stays up to date and supports new workloads quickly. However, 
Relay's current design cannot take full advantage of these new techniques, and 
the system shows signs of becoming fragile. Relax is a great opportunity for us 
to reconsider the graph-level IR design: prune the redundancies and add new 
functionality. It's exciting that we can unify different levels of optimization 
in [TVM Unity](https://github.com/apache/tvm-rfcs/pull/91) once Relax is 
accepted by the community. Refactoring makes things simpler, not more complex.
   
   Whenever we find it's time to make changes, TVM embraces new 
designs. This has happened several times in TVM's history: prior to Relay, there 
was NNVM, which was deprecated and completely replaced by Relay. The previous 
Tensor Expression had limited expressiveness, and its schedule-tree data 
structure could not support tensorization elegantly, so we built TensorIR, which 
is not only backward compatible but also opens opportunities for developing 
new dialects (Ruihang and I designed SparseTIR upon it, and it works pretty 
well). AutoTVM could not generate scheduling templates automatically, so we 
built Ansor and the Meta-Schedule. I would emphasize that **the most important 
parts of all these updates were upstreamed within several months** without 
breaking backward compatibility; the credit goes to our hard-working and 
open-minded contributors and reviewers. Committing to TVM helps these 
contributors become MLC experts, and some of them are PMC members now. I would 
say none of these refactorings hurt TVM's reputation; on the contrary, people 
are impressed by TVM's speed in adapting to the future, and they are more 
willing to try TVM because it's open and driven by innovation.
   
   I really don't understand what's different this time, when it comes to 
Relax. We have a bigger community; this is awesome, and I definitely welcome 
your input and constructive suggestions on the future of this project. I view 
the [New Scoped Module 
RFC](https://discuss.tvm.apache.org/t/process-rfc-empowering-new-scoped-module-to-the-project/13617)
 as a contract between industrial developers and researchers/engineers like me 
who work on "toy prototypes": we promise not to touch anything that might 
influence user experience, and in return we don't want to be discouraged 
because our prototypes cannot be upstreamed and only live in some random GitHub 
repo as toys. I also think the new S0-S1-S2 process is already the most 
painless approach to delivering new designs, and its effect is equivalent to 
*incremental change*. If people take a look at the Relax repo, it already has a 
huge amount of code and well-written documentation (you can compare it 
with the official Relay documentation). I think it would be super inappropriate 
to ignore these contributors' devotion, especially individual contributors such 
as @LeshengJin. TVM has a huge user base of researchers; they are an important 
part of the community, and they contribute high-quality code instead of just 
hacks.
   
   Regarding the "lower standard than other communities" issue: TVM has high 
standards, and this debate is not about standards. If no fundamental changes 
were allowed in DL infrastructure, Google would have stayed at TF 1.0 and never 
developed JAX, and PyTorch would not have created so many different compiler 
infrastructures (I want to share [this 
slide](https://chips-compilers-mlsys-22.github.io/assets/slides/PyTorch%20Compilers%20(Compiler%20&%20Chips%20Symposium%202022).pdf)
 again).
   
   It's 5 am in my timezone; I should get some sleep, as I'm still recovering 
from a recent illness. These opinions are my own, and I don't speak for any 
group or organization.
   
   Best,
   Zihao

