lidavidm commented on a change in pull request #12100:
URL: https://github.com/apache/arrow/pull/12100#discussion_r783424580



##########
File path: cpp/src/arrow/compute/function.cc
##########
@@ -215,9 +215,15 @@ Result<Datum> Function::Execute(const std::vector<Datum>& 
args,
   }
 
   util::tracing::Span span;
-  START_SPAN(span, name(),
-             {{"detail", options ? options->ToString() : "<NULLPTR>"},
-              {"function.kind", kind()}});
+  START_SPAN(
+      span, name(),
+      {{"function.args", std::accumulate(std::begin(args), std::end(args), 
std::string(),
+                                         [](std::string& ss, const Datum& d) {
+                                           return ss.empty() ? d.ToString()
+                                                             : ss + "," + 
d.ToString();
+                                         })},
+       {"function.options", options ? options->ToString() : "<NULLPTR>"},
+       {"function.kind", kind()}});

Review comment:
       nit, but is the kind useful? We'll know from the definition and/or from 
context (e.g. GroupByNode vs ProjectNode)

##########
File path: cpp/src/arrow/compute/exec/aggregate_node.cc
##########
@@ -393,7 +422,11 @@ class GroupByNode : public ExecNode {
     // Execute aggregate kernels
     for (size_t i = 0; i < agg_kernels_.size(); ++i) {
       util::tracing::Span span;
-      START_SPAN(span, "Aggregate", {{"function", aggs_[i].function}});
+      START_SPAN(span, aggs_[i].function,
+                 {{"function", aggs_[i].function},

Review comment:
       We don't record this attribute in function.cc, we should be consistent 
about this.

##########
File path: cpp/src/arrow/compute/exec/union_node.cc
##########
@@ -107,27 +106,34 @@ class UnionNode : public ExecNode {
     if (input_count_.Increment()) {
       outputs_[0]->InputFinished(this, total_batches_.load());
       if (batch_count_.SetTotal(total_batches_.load())) {
-        END_SPAN(span);
         finished_.MarkFinished();
       }
     }
   }
 
   Status StartProducing() override {
-    START_SPAN(span, kind_name(), {{"node", ToString()}});
+    START_SPAN(span_, std::string(kind_name()) + ":" + label(),
+               {{"node.label", label()},
+                {"node.detail", ToString()},
+                {"node.kind", kind_name()}});
     finished_ = Future<>::Make();
+#ifdef ARROW_WITH_OPENTELEMETRY
+    finished_.AddCallback([this](const Status& st) {

Review comment:
       I know I suggested this, but there is one concern here: once finished_ 
completes, the node may be destroyed. (Once all nodes finish, the plan 
finishes, and then the plan consumer might also destroy the plan since it's 
done.) Also, the order in which callbacks is run is not defined. It might be 
safest to copy `span_` into the span closure instead of capturing `this`. (Or, 
you could do `finished_ = finished_.Then(...)` which would mean that this 
callback gets to run before others, which would ensure it runs before anything 
that could potentially destroy the plan.)

##########
File path: cpp/src/arrow/compute/function.cc
##########
@@ -215,9 +215,15 @@ Result<Datum> Function::Execute(const std::vector<Datum>& 
args,
   }
 
   util::tracing::Span span;
-  START_SPAN(span, name(),
-             {{"detail", options ? options->ToString() : "<NULLPTR>"},
-              {"function.kind", kind()}});
+  START_SPAN(
+      span, name(),
+      {{"function.args", std::accumulate(std::begin(args), std::end(args), 
std::string(),
+                                         [](std::string& ss, const Datum& d) {
+                                           return ss.empty() ? d.ToString()
+                                                             : ss + "," + 
d.ToString();
+                                         })},
+       {"function.options", options ? options->ToString() : "<NULLPTR>"},
+       {"function.kind", kind()}});

Review comment:
       Same with the args - that'll give us datum types (array vs scalar), but 
I'm not sure what kind of analysis we'd do on that. (That said if collecting is 
cheap we might as well just always collect.)

##########
File path: cpp/src/arrow/compute/exec/aggregate_node.cc
##########
@@ -393,7 +422,11 @@ class GroupByNode : public ExecNode {
     // Execute aggregate kernels
     for (size_t i = 0; i < agg_kernels_.size(); ++i) {
       util::tracing::Span span;
-      START_SPAN(span, "Aggregate", {{"function", aggs_[i].function}});
+      START_SPAN(span, aggs_[i].function,
+                 {{"function", aggs_[i].function},
+                  {"function.options",
+                   aggs_[i].options ? aggs_[i].options->ToString() : 
"<NULLPTR>"},
+                  {"function.kind", "consume"}});

Review comment:
       maybe ScalarAggregateConsume/HashAggregateConsume etc.?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to