zhztheplayer commented on code in PR #7842:
URL: https://github.com/apache/incubator-gluten/pull/7842#discussion_r1865418672
##########
cpp/velox/compute/WholeStageResultIterator.cc:
##########
@@ -248,15 +248,47 @@ void WholeStageResultIterator::getOrderedNodeIds(
const std::shared_ptr<const velox::core::PlanNode>& planNode,
std::vector<velox::core::PlanNodeId>& nodeIds) {
bool isProjectNode = (std::dynamic_pointer_cast<const
velox::core::ProjectNode>(planNode) != nullptr);
+ bool isLocalExchangeNode = (std::dynamic_pointer_cast<const
velox::core::LocalPartitionNode>(planNode) != nullptr);
+ bool isUnionNode = isLocalExchangeNode &&
+ std::dynamic_pointer_cast<const
velox::core::LocalPartitionNode>(planNode)->type() ==
+ velox::core::LocalPartitionNode::Type::kGather;
const auto& sourceNodes = planNode->sources();
- for (const auto& sourceNode : sourceNodes) {
+ if (isProjectNode) {
+ GLUTEN_CHECK(sourceNodes.size() == 1, "Illegal state");
+ const auto sourceNode = sourceNodes.at(0);
// Filter over Project are mapped into FilterProject operator in Velox.
// Metrics are all applied on Project node, and the metrics for Filter node
// do not exist.
- if (isProjectNode && std::dynamic_pointer_cast<const
velox::core::FilterNode>(sourceNode)) {
+ if (std::dynamic_pointer_cast<const velox::core::FilterNode>(sourceNode)) {
omittedNodeIds_.insert(sourceNode->id());
}
getOrderedNodeIds(sourceNode, nodeIds);
+ nodeIds.emplace_back(planNode->id());
+ return;
+ }
+
+ if (isUnionNode) {
+ // FIXME: The whole metrics system in gluten-substrait is magic. Passing
metrics trees through JNI with a trivial
+ // array is possible but requires for a solid design. Apparently we
haven't had it. All the code requires complete
+ // rework.
Review Comment:
I found it's really painful to add new metrics in Gluten Velox backend.
There is a set of metrics from Velox query plan being converted to a tree or an
array again and again during passing to Spark Java side. A lot of magical IDs
are used during the tree / array traversal algorithms with a lot of special
handling of corner cases, e.g., join, filter project, and union which is added
in this PR.
The whole metric system should be reworked (if someone could lead this work)
otherwise it's unmaintainable.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]