roastduck opened a new pull request #5193: [TE] Support mixing normal and cross-thread reduction
URL: https://github.com/apache/incubator-tvm/pull/5193

Currently TVM only supports pure normal (i.e. sequential) reduction or pure cross-thread reduction. Because TVM does not yet support nested reduction, a mixed reduction cannot even be scheduled manually.

I modified the function that lowers cross-thread reduction so that it supports mixed reduction as well. The approach is straightforward: first perform the normal reduction into a local variable in each thread, then invoke the original cross-thread reduction intrinsic on those local variables.

It works like this (pseudo-code):

```c++
// Divide the loop nest into two parts
normal_red = sequential loop nest
common = other loop nest

// If normal_red is empty, fall back to the original code
normal_init = generate init for the temp var
normal_update = generate sequential reduction on the temp var
body = generate cross-thread reduction  // original code

// Merge loop nests and add some checks
body = SeqStmt(normal_init, MergeNest(normal_red, normal_update), body)
body = MergeNest(common, body)
return body
```

A test case is added as a Python unit test.

This is my first PR to TVM, and I am not sure whom to invite as a reviewer. Since this change is related to a compiler pass, @tqchen, could you review my code?
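For readers who want to see the two-phase scheme described above in action, here is a minimal sketch in plain Python. It only simulates the idea on the CPU and does not use any TVM API: the function name `mixed_reduce`, the `num_threads` parameter, and the strided assignment of elements to threads are all assumptions made for illustration, and the final loop merely stands in for the cross-thread reduction intrinsic.

```python
def mixed_reduce(data, num_threads):
    """Sum `data` in two phases: sequentially per thread, then across threads.

    Hypothetical illustration only; this is not the code generated by TVM.
    """
    # Phase 1 (normal reduction): each simulated thread accumulates a
    # strided slice of the input into its own local variable.
    local = [0] * num_threads                # plays the role of normal_init
    for tid in range(num_threads):           # stands in for a thread index
        for k in range(tid, len(data), num_threads):
            local[tid] += data[k]            # plays the role of normal_update

    # Phase 2 (cross-thread reduction): combine the per-thread partial
    # results. On a GPU this step would be the cross-thread reduction
    # intrinsic; here it is an ordinary loop.
    total = 0
    for tid in range(num_threads):
        total += local[tid]
    return total


data = list(range(1, 11))
result = mixed_reduce(data, num_threads=4)
```

With `num_threads=1` the first phase does all the work and the second phase is trivial. When there are at least as many threads as elements, each thread holds at most one element and the scheme degenerates to a pure cross-thread reduction, which mirrors the fallback case in the pseudo-code.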
