wrongtest opened a new pull request #10887: URL: https://github.com/apache/tvm/pull/10887
Hi there, this PR aims to make some workloads schedulable with the `reorder` primitive. A nice description of similar workloads can be found in ethosu's cascading scheduler work: https://github.com/apache/tvm-rfcs/blob/main/rfcs/0037-arm-ethosu-cascading-scheduler.md.

https://github.com/apache/tvm-rfcs/raw/main/resources/cascading-diagram.png

Generally, if we have consecutive ops such as conv and pooling, tile the last one, and `compute_at` the others under its outer loops, we create cascaded tiles for all of them at once. The block bindings of the sub-blocks (all except the last) are not affine: their tile regions overlap because of non-trivial strides and window sizes.

Under the current check we cannot `reorder` a sub-block's inner loops to perform subsequent optimizations, since a globally affine binding is required. Note, however, that once the outer loops are fixed, the block binding with respect to the inner loops generally remains affine. This PR tries to allow `reorder` in that situation.

The example script below builds two consecutive pooling ops with `te` and schedules them with TIR:

```python
import tvm
from tvm import topi

# Two consecutive 3x3 max-pooling ops (stride 1, dilation 1, no padding).
x = tvm.te.placeholder(shape=[1, 16, 112, 112], name="x")
y1 = topi.nn.pool2d(x, [3, 3], [1, 1], [1, 1], [0, 0, 0, 0], pool_type="max")
y2 = topi.nn.pool2d(y1, [3, 3], [1, 1], [1, 1], [0, 0, 0, 0], pool_type="max")
f = tvm.te.create_prim_func([x, y2])
s = tvm.tir.schedule.Schedule(f)

# Tile the H axis of the last pooling block ("tensor_1").
n, c, h, w, kh, kw = s.get_loops(s.get_block("tensor_1"))
ho, hi = s.split(h, factors=[None, 4])

# Move the first pooling block ("tensor") under the outer tile loop.
s.compute_at(s.get_block("tensor"), ho)

# Try to reorder the two innermost loops of the first pooling block.
v2, v3 = s.get_loops(s.get_block("tensor"))[-2:]
s.reorder(v2, v3)  # affine check failure!
```
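To make the overlap argument above concrete, here is a small plain-Python sketch (no TVM involved). It models only the H axis of the first pooling block after `compute_at`, under the assumption that each outer iteration covers 4 output rows and therefore needs 4 + 3 - 1 = 6 rows of the intermediate tensor. The helper names `producer_row` and `is_injective` are hypothetical and exist only for this illustration; this shows the overlap the description refers to and is not how TVM's affine check is implemented.

```python
# Illustrative sketch only: tile sizes are assumptions taken from the example
# script (split factor 4, 3x3 window, stride 1).
TILE = 4                            # consumer rows computed per outer iteration
WINDOW = 3                          # pooling window height
PRODUCER_ROWS = TILE + WINDOW - 1   # producer rows needed per tile


def producer_row(ho, hi):
    """Row of the intermediate tensor touched at outer index ho, inner index hi."""
    return ho * TILE + hi


def is_injective(pairs):
    """True if no two (ho, hi) pairs map to the same producer row."""
    rows = [producer_row(ho, hi) for ho, hi in pairs]
    return len(rows) == len(set(rows))


# Over both loops, adjacent tiles share rows, so the map is not one-to-one.
all_pairs = [(ho, hi) for ho in range(3) for hi in range(PRODUCER_ROWS)]
globally_injective = is_injective(all_pairs)

# With the outer loop fixed, the map is just a constant offset plus hi.
fixed_outer = [(1, hi) for hi in range(PRODUCER_ROWS)]
locally_injective = is_injective(fixed_outer)

print(globally_injective, locally_injective)
```

The first result reflects the overlapping tile regions across outer iterations; the second reflects why the binding restricted to the inner loops is still well behaved once `ho` is treated as a constant.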
