Issue 61604
Summary MLIR Affine Dialect Loop Fusion pass appears to not work
Labels new issue
Assignees
Reporter rohany
I’m playing around with the Affine dialect of MLIR, in particular the Loop Fusion pass. I haven’t been able to get it to fuse any loops with `mlir-opt` on some simple examples, including the ones from the documentation page here: https://mlir.llvm.org/docs/Passes/#-affine-loop-fusion-fuse-affine-loop-nests.

At a high level, I’m doing the following:
* Copying some MLIR source with fusable loops into a file “testing.mlir”
* Running `bin/mlir-opt testing.mlir --affine-loop-fusion --dump-pass-pipeline`

Here is a concrete input and the resulting output, in which every loop nest is left unfused:

```
➜  build git:(main) ✗ cat ../testing.mlir
func.func @producer_consumer_fusion(%arg0: memref<10xf32>, %arg1: memref<10xf32>) {
  %0 = memref.alloc() : memref<10xf32>
  %1 = memref.alloc() : memref<10xf32>
  %cst = arith.constant 0.000000e+00 : f32
  affine.for %arg2 = 0 to 10 {
    affine.store %cst, %0[%arg2] : memref<10xf32>
    affine.store %cst, %1[%arg2] : memref<10xf32>
  }
  affine.for %arg2 = 0 to 10 {
    %2 = affine.load %0[%arg2] : memref<10xf32>
    %3 = arith.addf %2, %2 : f32
    affine.store %3, %arg0[%arg2] : memref<10xf32>
  }
  affine.for %arg2 = 0 to 10 {
    %2 = affine.load %1[%arg2] : memref<10xf32>
    %3 = arith.mulf %2, %2 : f32
    affine.store %3, %arg1[%arg2] : memref<10xf32>
  }
  return
}

func.func @sibling_fusion(%arg0: memref<10x10xf32>, %arg1: memref<10x10xf32>,
                     %arg2: memref<10x10xf32>, %arg3: memref<10x10xf32>,
                     %arg4: memref<10x10xf32>) {
  affine.for %arg5 = 0 to 3 {
    affine.for %arg6 = 0 to 3 {
      %0 = affine.load %arg0[%arg5, %arg6] : memref<10x10xf32>
      %1 = affine.load %arg1[%arg5, %arg6] : memref<10x10xf32>
      %2 = arith.mulf %0, %1 : f32
      affine.store %2, %arg3[%arg5, %arg6] : memref<10x10xf32>
    }
  }
  affine.for %arg5 = 0 to 3 {
    affine.for %arg6 = 0 to 3 {
      %0 = affine.load %arg0[%arg5, %arg6] : memref<10x10xf32>
      %1 = affine.load %arg2[%arg5, %arg6] : memref<10x10xf32>
      %2 = arith.addf %0, %1 : f32
      affine.store %2, %arg4[%arg5, %arg6] : memref<10x10xf32>
    }
  }
  return
}

➜  build git:(main) ✗ bin/mlir-opt ../testing.mlir --affine-loop-fusion --dump-pass-pipeline
Pass Manager with 1 passes:
builtin.module(affine-loop-fusion{fusion-compute-tolerance=3.000000e-01 fusion-fast-mem-space=0 fusion-local-buf-threshold=0 fusion-maximal=false mode=producer})

module {
  func.func @producer_consumer_fusion(%arg0: memref<10xf32>, %arg1: memref<10xf32>) {
    %alloc = memref.alloc() : memref<10xf32>
    %alloc_0 = memref.alloc() : memref<10xf32>
    %cst = arith.constant 0.000000e+00 : f32
    affine.for %arg2 = 0 to 10 {
      affine.store %cst, %alloc[%arg2] : memref<10xf32>
      affine.store %cst, %alloc_0[%arg2] : memref<10xf32>
    }
    affine.for %arg2 = 0 to 10 {
      %0 = affine.load %alloc[%arg2] : memref<10xf32>
      %1 = arith.addf %0, %0 : f32
      affine.store %1, %arg0[%arg2] : memref<10xf32>
    }
    affine.for %arg2 = 0 to 10 {
      %0 = affine.load %alloc_0[%arg2] : memref<10xf32>
      %1 = arith.mulf %0, %0 : f32
      affine.store %1, %arg1[%arg2] : memref<10xf32>
    }
    return
  }
  func.func @sibling_fusion(%arg0: memref<10x10xf32>, %arg1: memref<10x10xf32>, %arg2: memref<10x10xf32>, %arg3: memref<10x10xf32>, %arg4: memref<10x10xf32>) {
    affine.for %arg5 = 0 to 3 {
      affine.for %arg6 = 0 to 3 {
        %0 = affine.load %arg0[%arg5, %arg6] : memref<10x10xf32>
        %1 = affine.load %arg1[%arg5, %arg6] : memref<10x10xf32>
        %2 = arith.mulf %0, %1 : f32
        affine.store %2, %arg3[%arg5, %arg6] : memref<10x10xf32>
      }
    }
    affine.for %arg5 = 0 to 3 {
      affine.for %arg6 = 0 to 3 {
        %0 = affine.load %arg0[%arg5, %arg6] : memref<10x10xf32>
        %1 = affine.load %arg2[%arg5, %arg6] : memref<10x10xf32>
        %2 = arith.addf %0, %1 : f32
        affine.store %2, %arg4[%arg5, %arg6] : memref<10x10xf32>
      }
    }
    return
  }
}
```
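For comparison, here is a minimal Python sketch of what fusing the three loops in `@producer_consumer_fusion` would mean semantically. It is only a model of the expected transformation, not output from `mlir-opt`; the function names `unfused` and `fused` and the use of Python lists in place of memrefs are assumptions made for illustration.

```python
def unfused(n=10):
    """Mirror of the input IR: one producer loop, then two consumer loops."""
    buf0 = [None] * n  # stands in for the first memref.alloc
    buf1 = [None] * n  # stands in for the second memref.alloc
    out0 = [None] * n  # stands in for %arg0
    out1 = [None] * n  # stands in for %arg1
    for i in range(n):
        buf0[i] = 0.0
        buf1[i] = 0.0
    for i in range(n):
        out0[i] = buf0[i] + buf0[i]
    for i in range(n):
        out1[i] = buf1[i] * buf1[i]
    return out0, out1


def fused(n=10):
    """What a fused version could look like: a single loop, with the
    intermediate buffers reduced to per-iteration temporaries."""
    out0 = [None] * n
    out1 = [None] * n
    for i in range(n):
        t0 = 0.0  # value that was stored to buf0[i]
        t1 = 0.0  # value that was stored to buf1[i]
        out0[i] = t0 + t0
        out1[i] = t1 * t1
    return out0, out1
```

Both functions compute the same outputs; the difference is only in loop structure, which is what the pass output above would be expected to show if fusion had happened.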

A possibly related bug is that I seem to be unable to set the value of the `mode` option of the loop fusion pass. Even if I pass `--affine-loop-fusion="mode=greedy"`, the pipeline printed by `--dump-pass-pipeline` always reports `mode=producer`.
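To rule out shell quoting as the cause, one option is to build the argument list programmatically and use the textual `--pass-pipeline` form, with the same `{mode=...}` syntax that the pass manager prints above. The sketch below is a hedged example: `fusion_argv` and `run_fusion` are hypothetical helper names, and the `nest_in_func` switch encodes an untested guess that the pass may behave differently when anchored on `func.func` rather than on `builtin.module`.

```python
import subprocess


def fusion_argv(mlir_opt, path, mode="greedy", nest_in_func=False):
    """Build an mlir-opt argument list without going through a shell.

    nest_in_func is an assumption to experiment with: it nests the pass
    under func.func instead of running it directly on builtin.module.
    """
    pass_spec = f"affine-loop-fusion{{mode={mode}}}"
    if nest_in_func:
        pass_spec = f"func.func({pass_spec})"
    return [
        mlir_opt,
        path,
        f"--pass-pipeline=builtin.module({pass_spec})",
        "--dump-pass-pipeline",
    ]


def run_fusion(mlir_opt, path, **kwargs):
    """Run mlir-opt and return (stdout, stderr) as text for inspection."""
    proc = subprocess.run(
        fusion_argv(mlir_opt, path, **kwargs),
        capture_output=True,
        text=True,
    )
    return proc.stdout, proc.stderr
```

Calling `run_fusion("bin/mlir-opt", "../testing.mlir", mode="greedy")` and then again with `nest_in_func=True` would show whether the reported `mode` or the emitted IR changes between the two pipelines.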

_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs