tqchen commented on code in PR #18876:
URL: https://github.com/apache/tvm/pull/18876#discussion_r2892025957
##########
src/s_tir/transform/lower_thread_allreduce.cc:
##########
@@ -76,16 +76,32 @@ class ThreadAllreduceBuilder final : public StmtExprMutator
{
}
}
Stmt VisitStmt_(const AllocBufferNode* op) final {
+ // In flat IR, alloc_remap_ may not yet be populated when this AllocBuffer
is visited
+ // (the remap is set up by MakeAllreduce which runs during
AttrStmt/Evaluate visit
+ // that appears later in the sequence). We record the original data
pointer and
+ // attempt the remap; if it's not ready, the post-processing pass will
handle it.
+ const VarNode* orig_data_ptr = op->buffer->data.get();
auto node = Downcast<AllocBuffer>(StmtExprMutator::VisitStmt_(op));
- if (auto it = alloc_remap_.find(node->buffer->data.get()); it !=
alloc_remap_.end()) {
- Buffer buf = Downcast<Buffer>(it->second);
- auto write_ptr = node.CopyOnWrite();
- write_ptr->buffer = buf;
+ if (auto it = alloc_remap_.find(orig_data_ptr); it != alloc_remap_.end()) {
+ return RemapAllocBuffer(node, it->second);
+ }
+ // Record for deferred remapping (flat IR case)
+ pending_alloc_buffers_.emplace_back(orig_data_ptr);
+ return node;
+ }
- if (buf.scope() == "shared") {
- write_ptr->body = AttrStmt(buf->data, tir::attr::volatile_scope, 1,
write_ptr->body);
- }
+ /*!
+ * \brief Remap an AllocBuffer node to use the replacement buffer.
+ * \param node The original AllocBuffer node.
+ * \param replacement The replacement buffer.
+ * \return The remapped statement(s).
+ */
+ Stmt RemapAllocBuffer(AllocBuffer node, const Buffer& replacement) {
+ node.CopyOnWrite()->buffer = replacement;
+ if (replacement.scope() == "shared") {
+ Stmt volatile_attr = AttrStmt(replacement->data,
tir::attr::volatile_scope, 1, Evaluate(0));
+ return SeqStmt::Flatten(node, volatile_attr);
Review Comment:
This is not an issue. All three codegen backends use a **persistent
set/flag** when encountering `volatile_scope`, not a scoped approach:
- `codegen_c.cc:1088`: `volatile_buf_.insert(v)` — persistent set
- `codegen_llvm.cc:2038`: `volatile_buf_.insert(v)` — persistent set
- `codegen_spirv.cc:879`: `storage_info_[v].is_volatile = true` — persistent
flag
Once the var is marked volatile, it stays volatile for all subsequent
accesses regardless of the AttrStmt's body content. The `Evaluate(0)` body is
visited as a no-op, but the volatile marking has already been applied. In flat
IR, the codegen visits SeqStmt children sequentially — the volatile_scope
AttrStmt is visited first, then the buffer accesses that follow as siblings.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]