kimm240 opened a new pull request, #18173:
URL: https://github.com/apache/tvm/pull/18173
This PR introduces operator fusion for the conv2d followed by reshape, add,
and relu sequence, commonly found in deep learning models (e.g., the
convolution + bias + activation pattern produced by PyTorch frontends). The
fusion improves performance and efficiency by reducing kernel launch overhead
and memory traffic.
Performance Improvement:
Reduced Kernel Launch Overhead: Previously, conv2d, reshape, add, and relu
each required a separate kernel call. By fusing these four operations into a
single DNNL kernel (e.g., dnnl_fused_conv2d_bias_relu), the overhead of
multiple kernel launches is significantly reduced; in
src/runtime/contrib/dnnl/dnnl.cc:154-158 all four operations are handled by a
single execute call.
Decreased Memory Bandwidth Consumption: Without fusion, the intermediate
results of the individual operations (e.g., conv_out, bias_add) require costly
write-backs to and reads from memory. Fusion keeps these intermediate values
in registers or cache, avoiding unnecessary memory accesses and thus reducing
memory bandwidth usage and overall execution time.
Increased Efficiency:
Leveraging Compiler Optimizations: Using TVM's FuseOpsByPattern and
MergeCompositeFunctions passes, this change produces a composite function that
is offloaded to a specific backend (here, DNNL). Common patterns coming from
frontends like PyTorch are thus recognized automatically in the TVM graph and
mapped to high-performance fused kernels provided by libraries such as DNNL.
Simplified IR Module: The compiler's intermediate representation (IR) becomes
less complex because multiple operator nodes are condensed into a single
composite node, which simplifies subsequent optimization and code generation
stages.
This fusion is achieved through a two-stage transformation within the TVM
Relax framework:
Pattern Recognition and Composite Function Creation
(FuseConv2dReshapeAddRelu Pass):
The FuseConv2dReshapeAddRelu class, registered as a
tvm.transform.module_pass, transforms the IRModule.
The _conv2d_reshape_add_relu_pattern() helper function defines the target
sequence conv2d -> reshape (applied to the bias) -> add -> relu using TVM's
Relax dataflow pattern language (DPL): the input tensors (data, weight, bias,
shape) are matched with wildcard() and the operation sequence with is_op(). A
sketch of such a pattern is shown at the end of this step.
The relax.transform.FuseOpsByPattern pass identifies this pattern in the
input IRModule. Upon detection, the operation sequence is encapsulated into a
new Relax function with {"Composite": "dnnl.conv2d_reshape_add_relu",
"Primitive": True} attributes, marking it as a logical "composite" unit.
Composite Function Merging and Codegen Attribute Assignment
(MergeCompositeFunctions Pass):
Following the FuseConv2dReshapeAddRelu pass, the MergeCompositeFunctions
pass is applied via tvm.ir.transform.Sequential.
This pass identifies functions marked with the Composite attribute and
transforms them into external functions bearing the {"Codegen": "dnnl"}
attribute, which indicates that the composite operation should be offloaded to
a specific TVM backend, in this case DNNL. The combined pass pipeline is
sketched below.
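A minimal sketch of the combined pipeline (variable names are illustrative)
is:

```python
import tvm
from tvm import relax

# 'mod' is an IRModule containing the conv2d -> reshape -> add -> relu chain.
patterns = [("dnnl.conv2d_reshape_add_relu", _conv2d_reshape_add_relu_pattern())]
seq = tvm.ir.transform.Sequential(
    [
        relax.transform.FuseOpsByPattern(patterns),  # creates the Composite functions
        relax.transform.MergeCompositeFunctions(),   # attaches {"Codegen": "dnnl"}
    ]
)
mod = seq(mod)
```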
Consequently, during graph execution, the fused function carrying the Codegen
attribute is mapped to and executed by a single optimized DNNL kernel, for
instance dnnl_fused_conv2d_bias_relu (defined in
src/runtime/contrib/dnnl/dnnl.cc:199-207).
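An illustrative build-and-run flow is sketched below; the entry-function name,
tensor shapes, and layouts are assumptions for this example, and a TVM build
with DNNL support is required:

```python
import numpy as np

# Materialize the external DNNL functions, then build and run on the Relax VM.
mod = relax.transform.RunCodegen()(mod)
ex = relax.build(mod, target="llvm")
vm = relax.VirtualMachine(ex, tvm.cpu())

data = tvm.nd.array(np.random.randn(1, 3, 32, 32).astype("float32"))   # assumed NCHW input
weight = tvm.nd.array(np.random.randn(16, 3, 3, 3).astype("float32"))  # assumed OIHW weight
bias = tvm.nd.array(np.random.randn(16).astype("float32"))
out = vm["main"](data, weight, bias)  # "main" is an assumed entry-function name
```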
This implementation enables fusion of the conv2d + reshape + add + relu
pattern, so common convolution + bias + activation chains originating from
frontends like PyTorch are now executed as a single, efficient DNNL kernel
within TVM.
To verify the fusion, run the dedicated test case:
python tests/python/relax/test_conv2d_reshape_add_relu.py