Lunderberg commented on PR #15706: URL: https://github.com/apache/tvm/pull/15706#issuecomment-1721406739
@junrushao True, it depends on what type of IR somebody usually looks at. Relax IR tends to be very flat, due to the restricted form of the `DataflowBlock`. The TIR examples tend to be more nested, especially with many loops over high-dimensionality buffers.

That said, from the example linked, there are a number of improvements we could make that would make the TVMScript more readable, with easier `black` auto-formatting coming along as a side effect.

* Avoid extraneous parentheses. Currently, TVMScript prints explicit parentheses around operations, even when the parentheses are unnecessary. `((x_outer * 32768) + (x_c * 1024)) + (k_outer * 4)` could instead be `x_outer * 32768 + x_c * 1024 + k_outer * 4`.
* The `T.grid` syntax is only used for serial loops. If it had an additional keyword-only argument for the loop types, nested loops of other types, such as `for x in T.parallel(0, 32): for y in T.serial(0, 1024):`, could instead be written as `for x, y in T.grid(32, 1024, type='PS')`.
* `T.ramp` and `T.broadcast` can be printed as Python slices into a buffer, rather than explicitly written. This lets `C_global[T.ramp((x_c_init * 32), 1, 32)] = T.broadcast(T.float32(0), 32)` be written as `C_global[x_c_init*32: x_c_init*32 + 32] = T.float32(0)`.

There are also lowering passes we could implement that would improve performance, with improved TVMScript readability as a side effect.

* Automatic loop fusion. If two nested loops have the same type, that type is not `kVectorized`, and the loop iterators are only ever used in the expression `iter_outer*extent_inner + iter_inner`, then the loops can be fused into a single loop.
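To make the parenthesis point concrete, here is a rough sketch of precedence-aware printing. It is not TVM's printer: the tuple-based expression tree, the `PRECEDENCE` table, and `to_str` are all hypothetical names chosen for illustration, and it only assumes left-associative binary operators.

```python
# Hypothetical mini expression tree, NOT a TVM data structure:
# a node is either a leaf (str or int) or a tuple (op, lhs, rhs).
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "//": 2, "%": 2}


def to_str(expr, parent_prec=0, is_right=False):
    """Print `expr`, adding parentheses only where they change the meaning."""
    if not isinstance(expr, tuple):
        return str(expr)
    op, lhs, rhs = expr
    prec = PRECEDENCE[op]
    text = f"{to_str(lhs, prec, False)} {op} {to_str(rhs, prec, True)}"
    # Parentheses are needed when the parent binds tighter, or when this node
    # is the right operand of an equal-precedence (left-associative) parent.
    if prec < parent_prec or (prec == parent_prec and is_right):
        return f"({text})"
    return text


index = ("+", ("+", ("*", "x_outer", 32768), ("*", "x_c", 1024)), ("*", "k_outer", 4))
print(to_str(index))
```

With this rule, the index expression from the first bullet prints without any parentheses, while something like `a - (b - c)` or `(a + b) * c` keeps the ones it needs.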

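The loop-fusion condition can also be checked in plain Python. This is only a sketch of the index arithmetic, not a lowering pass: `nested_indices` and `fused_indices` are hypothetical helpers, and the extents 32 and 1024 are borrowed from the `T.grid` example purely for illustration.

```python
def nested_indices(extent_outer, extent_inner):
    """Indices visited when the iterators are only used as outer*extent_inner + inner."""
    return [
        iter_outer * extent_inner + iter_inner
        for iter_outer in range(extent_outer)
        for iter_inner in range(extent_inner)
    ]


def fused_indices(extent_outer, extent_inner):
    """Indices visited by a single fused loop over the combined extent."""
    return list(range(extent_outer * extent_inner))


# The two loop structures visit the same indices in the same order.
same = nested_indices(32, 1024) == fused_indices(32, 1024)
print(same)

# The original iterators remain recoverable from the fused iterator if needed.
fused = 5 * 1024 + 7
print(divmod(fused, 1024))
```

Because the nested form visits exactly the same indices in the same order as the fused form, replacing one with the other does not change the iteration when the iterators appear nowhere else.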