comaniac commented on a change in pull request #8497:
URL: https://github.com/apache/tvm/pull/8497#discussion_r695059422
##########
File path: tests/cpp/build_module_test.cc
##########
@@ -199,3 +199,129 @@ TEST(BuildModule, Heterogeneous) {
ICHECK_LT(std::fabs(p_out[i] - (i + (i + 1.0) - (i - 1.0))), 1e-5);
}
}
+
+TEST(BuildModule, ZeroCopy) {
+ /*
+ *
+ * A B
+ * \ /
+ * elemwise_add(out0)
+ * \
+ * C copy
+ * \ /
+ * elemwise_sub(out1)
+ */
+
+ using namespace tvm;
+ using namespace tvm::te;
+
+ auto target_llvm = Target("llvm");
+
+ // The shape of input tensors.
+ const int n = 4;
+ Array<PrimExpr> shape{n};
+
+ auto A = placeholder(shape, DataType::Float(32), "A");
+ auto B = placeholder(shape, DataType::Float(32), "B");
+ auto C = placeholder(shape, DataType::Float(32), "C");
+
+ auto elemwise_add = compute(
+ A->shape, [&A, &B](PrimExpr i) { return A[i] + B[i]; }, "elemwise_add");
+
+ auto copy = placeholder(shape, DataType::Float(32), "__copy");
+ auto elemwise_sub = compute(
+ C->shape, [&copy, &C](PrimExpr i) { return copy[i] - C[i]; }, "elemwise_sub");
+
+ With<Target> llvm_scope(target_llvm);
+ auto s1 = create_schedule({elemwise_add->op});
+ auto s2 = create_schedule({elemwise_sub->op});
+
+ auto args1 = Array<Tensor>({A, B, elemwise_add});
+ auto args2 = Array<Tensor>({copy, C, elemwise_sub});
+
+ std::unordered_map<Tensor, Buffer> binds;
+ auto lowered_s1 = LowerSchedule(s1, args1, "elemwise_add", binds);
+ auto lowered_s2 = LowerSchedule(s2, args2, "elemwise_sub", binds);
+ Map<tvm::Target, IRModule> inputs = {{target_llvm, lowered_s1}, {target_llvm, lowered_s2}};
+ auto module = build(inputs, Target());
+
+ // Execute the graph and check the correctness.
+ // Setup graph json.
+ std::string json =
Review comment:
I really don't like testing in this way. Hard-coding the expected output
(e.g., assembly, JSON, etc.) may make future maintenance difficult. IMHO, it
should be sufficient to build two modules and set one of them to zero copy,
so that the only difference between the two modules is the execution latency,
and their outputs should be identical.
Also, it would be good to have a Python test as well, so that we can
demonstrate how this feature is used from Python; otherwise no one will know
about this feature at all, as there's no documentation either.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]