kimm240 opened a new pull request, #18171:
URL: https://github.com/apache/tvm/pull/18171
This PR introduces operator fusion for the `conv2d` -> `reshape` -> `add` ->
`relu` sequence, which is commonly found in deep learning models (e.g., the
convolution + bias + activation pattern in PyTorch). The fusion improves
performance and efficiency by reducing kernel launch overhead and memory
traffic.
1. **Performance Improvement:**
* **Reduced Kernel Launch Overhead:** Previously, `conv2d`, `reshape`,
`add`, and `relu` each required separate kernel calls. By fusing these four
operations into a single, unified DNNL kernel (e.g.,
`dnnl_fused_conv2d_bias_relu`), the overhead from multiple kernel launches is
significantly reduced. This is evident from
`src/runtime/contrib/dnnl/dnnl.cc:154-158`, where all operations are handled by
a single `execute` call.
* **Decreased Memory Bandwidth Consumption:** When the operations run
separately, their intermediate results (e.g., `conv_out`, `bias_add`) must be
written back to and read from memory. Fusion keeps these intermediate values in
registers or cache, reducing unnecessary memory accesses and thus lowering
memory bandwidth usage and overall execution time.
2. **Increased Efficiency:**
* **Leveraging Compiler Optimizations:** By utilizing TVM's
`FuseOpsByPattern` and `MergeCompositeFunctions` passes, this change generates
a composite operation optimized for specific backends (like DNNL). This ensures
that common patterns from frontends like PyTorch are automatically recognized
within the TVM graph and mapped to high-performance fused kernels provided by
libraries like DNNL.
* **Simplified IR Module:** The intermediate representation (IR) becomes
simpler as multiple operation nodes are condensed into a single composite node,
which benefits subsequent optimization and code generation stages.
This fusion is achieved through a two-stage transformation within the TVM
Relax framework:
1. **Pattern Recognition and Composite Function Creation
(`FuseConv2dReshapeAddRelu` Pass):**
* The `FuseConv2dReshapeAddRelu` class, registered as a
`tvm.transform.module_pass`, transforms the `IRModule`.
* The `_conv2d_reshape_add_relu_pattern()` helper function defines the
specific sequence `conv2d` -> `reshape` (applied to the bias) -> `add` -> `relu`
using TVM's Relax dataflow pattern language (DPL): the input tensors (`data`,
`weight`, `bias`, `shape`) are matched with `wildcard()`, and the operation
sequence is identified with `is_op()` (see the sketch after this list).
* The `relax.transform.FuseOpsByPattern` pass identifies this pattern in
the input `IRModule`. Upon detection, the operation sequence is encapsulated
into a new Relax function with `{"Composite": "dnnl.conv2d_reshape_add_relu",
"Primitive": True}` attributes, marking it as a logical "composite" unit.
2. **Composite Function Merging and Codegen Attribute Assignment
(`MergeCompositeFunctions` Pass):**
* Following the `FuseConv2dReshapeAddRelu` pass, the
`MergeCompositeFunctions` pass is applied via `tvm.ir.transform.Sequential`.
* This pass identifies functions marked with the `Composite` attribute
and transforms them into external functions bearing the `{"Codegen": "dnnl"}`
attribute. This `Codegen` attribute indicates that the composite operation
should be offloaded to a specific TVM backend, such as DNNL.
* Consequently, during graph execution, the fused function carrying the
`Codegen` attribute is mapped to and executed by a single, optimized DNNL
kernel, for instance `dnnl_fused_conv2d_bias_relu` (defined in
`src/runtime/contrib/dnnl/dnnl.cc:199-207`).
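
For reference, below is a minimal sketch of how the two stages could be wired
together. The pattern helper name, the pass names, and the
`dnnl.conv2d_reshape_add_relu` composite name are taken from the description
above; the exact wildcard bindings and the `apply_fusion` wrapper (standing in
for the `FuseConv2dReshapeAddRelu` module pass) are illustrative and may differ
from the code in this PR.

```python
# Minimal sketch of the two-stage fusion pipeline (illustrative, not the exact PR code).
import tvm
from tvm import relax
from tvm.relax.dpl import is_op, wildcard


def _conv2d_reshape_add_relu_pattern():
    # Match conv2d(data, weight), a bias reshaped to a broadcastable shape,
    # the element-wise add of the two, and a trailing relu.
    data = wildcard()
    weight = wildcard()
    bias = wildcard()
    shape = wildcard()

    conv = is_op("relax.nn.conv2d")(data, weight)
    reshaped_bias = is_op("relax.reshape")(bias, shape)
    added = is_op("relax.add")(conv, reshaped_bias)
    return is_op("relax.nn.relu")(added)


def apply_fusion(mod: tvm.IRModule) -> tvm.IRModule:
    # Stage 1: wrap each match in a function carrying the Composite attribute.
    # Stage 2: merge composites into functions tagged with Codegen="dnnl"
    #          (derived from the "dnnl." prefix of the composite name).
    seq = tvm.ir.transform.Sequential(
        [
            relax.transform.FuseOpsByPattern(
                [("dnnl.conv2d_reshape_add_relu", _conv2d_reshape_add_relu_pattern())]
            ),
            relax.transform.MergeCompositeFunctions(),
        ]
    )
    return seq(mod)
```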
This implementation enables fusion of the `conv2d + reshape + add + relu`
pattern, ensuring that convolution + bias + activation patterns originating
from frontends like PyTorch are lowered to and executed as a single, efficient
DNNL kernel within TVM.
---
To verify the fusion, run the dedicated test case:

```bash
python tests/python/relax/test_conv2d_reshape_add_relu.py
```
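
In addition, a quick structural check along the following lines (reusing the
illustrative `apply_fusion` helper from the sketch above, with `mod` standing
in for any `IRModule` that contains the pattern) can confirm that a function
tagged with `Codegen: "dnnl"` appears after the passes run; this is a sketch,
not part of the PR's test suite.

```python
# Illustrative check (not part of this PR): after the fusion pipeline runs,
# at least one function should carry the {"Codegen": "dnnl"} attribute.
# `apply_fusion` and `mod` are the hypothetical names from the sketch above.
fused_mod = apply_fusion(mod)
assert any(
    func.attrs is not None and func.attrs.get("Codegen") == "dnnl"
    for _, func in fused_mod.functions.items()
), "expected at least one function offloaded to DNNL"
```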