lazycal opened a new pull request #10118: URL: https://github.com/apache/tvm/pull/10118
Fixes #10109.

During the alter-layout pass, each node (`LayoutAlternatedExprNode`) is tied to one specific layout stored in `old_layout` (or `new_layout` after altering the layout, https://github.com/apache/tvm/blob/6a274af9cb4e7baf6d9dd06f0fb38bea5ac3154a/src/relay/transforms/transform_layout.h#L175-L176). This layout can be determined either by one of the node's consumers or by its producer. Previously these were implicitly assumed to agree with each other, but that imposes restrictions on the graph, and some reasonable graphs violate the assumption (see #10109 for a concrete example). In fact, this assumption is stronger than the pass needs: all we need is the "delta" describing how each tensor's layout is transformed; we don't need to know the concrete layouts before or after the change.

This PR removes that assumption. Now the specific layouts stored in `old_layout` and `new_layout` do not matter in themselves; they only convey that the tensor was changed by the transformation `old_layout` -> `new_layout`. With this change, the alter-layout pass can be understood as follows (at least IMO):

1. Visit each node in post-DFS order.
2. For each node, call the op-specific alter-layout function. Its `InferCorrectLayout` function tells us the "expected" input layouts from the consumer's perspective (e.g., `conv_nchw` will say its 1st input is NCHW and its 2nd is OIHW), on both the graph before and the graph after the rewrite, denoted `old_in2` and `new_in2`.
   When they differ, we need to apply the transformation `old_in2` -> `new_in2` to the **original** inputs for the op to work properly (https://github.com/lazycal/tvm/blob/42364531dea0fe72e1ffc80aba89ca04e50b2a67/src/relay/transforms/transform_layout.h#L384).
3. Since that transformation assumes the original input layout, we must first transform each input back to its original layout (https://github.com/lazycal/tvm/blob/42364531dea0fe72e1ffc80aba89ca04e50b2a67/src/relay/transforms/transform_layout.h#L383).
4. The two transformations can be fused by aligning the layout letters; e.g., `layout_transform(CN->NC)` followed by `layout_transform(NW->NW8w)` is equivalent to `layout_transform(CN->NC8c)`. The previous assumption implied the layouts were already aligned, so only one transformation was inserted (https://github.com/lazycal/tvm/blob/42364531dea0fe72e1ffc80aba89ca04e50b2a67/src/relay/transforms/transform_layout.h#L381). I kept this case as well.

The only thing I didn't fully check is the `InferCorrectLayout` implementations. I am not fully aware of their formal semantics, but I assumed they should return `old_in2` and `new_in2` that are mutually transformable. Some ops may not conform to this, as a workaround for the previous restriction. For example, I found that the dense op has a special hack (https://github.com/apache/tvm/compare/main...lazycal:fix-layout-pass?expand=1#diff-b1f7105acbdb593d30dc3f1506f8f226d8663164bf0e46702f8b050b056604f6L213-L216), so I simply removed it.

## Misc

Another potential bug I found is that the AlterOpLayout pass seems to be overloaded: people often use it to replace a single node with a whole subgraph (e.g., https://github.com/apache/tvm/blob/d1dafbd6d050ddcd673f567c5ff720a6b6419cfe/python/tvm/topi/x86/conv2d_alter_op.py#L171-L181), while the current logic in `transform_layout.h` assumes only a node-for-node replacement.
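To make the fusion in step 4 concrete, here is a small self-contained Python sketch. This is not TVM code: `tokens`, `fuse_transforms`, and `infer_shape` are my own simplified helpers, and they assume the source layouts are primal-only (no sub-axes). The fusion works by renaming the second transform's axis letters positionally so that its source lines up with the first transform's destination:

```python
import re

def tokens(layout):
    # Split a layout string into axis tokens, e.g. "NC8c" -> ["N", "C", "8c"].
    return re.findall(r"\d*[A-Za-z]", layout)

def fuse_transforms(t1, t2):
    """Fuse layout_transform(*t1) followed by layout_transform(*t2).

    t1 = (src1, dst1), t2 = (src2, dst2). Assumes dst1 and src2 are
    primal-only layouts of the same rank; t2's axis letters are renamed
    positionally so that src2 lines up with dst1.
    """
    src1, dst1 = t1
    src2, dst2 = t2
    # The i-th axis of src2 corresponds to the i-th axis of dst1.
    rename = dict(zip(tokens(src2), tokens(dst1)))
    out = []
    for tok in tokens(dst2):
        if tok[-1].isupper():  # primal axis, e.g. "N"
            out.append(rename[tok])
        else:                  # split sub-axis, e.g. "8w" -> "8c"
            out.append(tok[:-1] + rename[tok[-1].upper()].lower())
    return (src1, "".join(out))

def infer_shape(t, shape):
    """Output shape of layout_transform(src -> dst) on `shape` (src primal-only)."""
    src, dst = t
    size = dict(zip(tokens(src), shape))
    factor = {tok[-1].upper(): int(tok[:-1])
              for tok in tokens(dst) if tok[-1].islower()}
    return tuple(size[tok] // factor.get(tok, 1) if tok[-1].isupper()
                 else factor[tok[-1].upper()]
                 for tok in tokens(dst))

# The example from step 4: CN->NC followed by NW->NW8w fuses into CN->NC8c.
fused = fuse_transforms(("CN", "NC"), ("NW", "NW8w"))
print(fused)  # ('CN', 'NC8c')

# Sanity check: both routes produce the same shape for a (C=16, N=4) tensor.
step_by_step = infer_shape(("NW", "NW8w"), infer_shape(("CN", "NC"), (16, 4)))
print(step_by_step, infer_shape(fused, (16, 4)))  # (4, 2, 8) (4, 2, 8)
```

When the layouts are already aligned (the case the old code assumed), the rename map is the identity and the fusion degenerates to simple concatenation, which is why a single inserted transform sufficed before.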
For example, at https://github.com/apache/tvm/blob/6a274af9cb4e7baf6d9dd06f0fb38bea5ac3154a/src/relay/transforms/transform_layout.h#L388 the transformation is performed on the args of the call object, but I believe in the subgraph case it should instead be applied to the args of the subgraph.
