mdw-octoml commented on a change in pull request #8503:
URL: https://github.com/apache/tvm/pull/8503#discussion_r672796586



##########
File path: src/runtime/profiling.cc
##########
@@ -228,6 +228,71 @@ String ReportNode::AsCSV() const {
   return s.str();
 }
 
+namespace {
+void print_metric(std::ostream& os, ObjectRef o) {
+  if (o.as<StringObj>()) {
+    os << "\"" << Downcast<String>(o) << "\"";
+  } else if (const CountNode* n = o.as<CountNode>()) {
+    os << "{\"count\":" << std::to_string(n->value) << "}";
+  } else if (const DurationNode* n = o.as<DurationNode>()) {
+    os << "{\"microseconds\":" << std::to_string(n->microseconds) << "}";
+  } else if (const PercentNode* n = o.as<PercentNode>()) {
+    os << "{\"percent\":" << std::to_string(n->percent) << "}";
+  } else {
+    LOG(FATAL) << "Unprintable type " << o->GetTypeKey();
+  }
+}
+}  // namespace
+
+String ReportNode::AsJSON() const {
+  std::ostringstream s;
+  // DMLC's JSONWriter does not allow us to write a key value pair without
+  // implementing Write for the value. We want a specific write for the value,
+  // so we would have to implement a custom data structure for each type of
+  // value we want to print. Instead we construct the json by hand because it
+  // is easier.
+  s << "{";
+  s << "\"calls\":[";
+  for (size_t i = 0; i < calls.size(); i++) {
+    size_t j = 0;
+    s << "{";
+    for (const auto& kv : calls[i]) {
+      s << "\"" << kv.first << "\":";
+      print_metric(s, kv.second);
+      if (j < calls[i].size() - 1) {
+        s << ",";
+      }
+      j++;
+    }
+    s << "}";
+    if (i < calls.size() - 1) {
+      s << ",";
+    }
+  }
+  s << "],";
+  s << "\"device_metrics\":{";
+  size_t i = 0;
+  for (const auto& dev_kv : device_metrics) {
+    size_t j = 0;
+    s << "\"" << dev_kv.first << "\":{";
+    for (const auto& metric_kv : dev_kv.second) {
+      s << "\"" << metric_kv.first << "\":";

Review comment:
       I know this is being really nit-picky, but looking at how these profile 
reports are being generated, having the key be `Duration (us)` is kind of silly 
in the case of a JSON-formatted object (I understand why it should be 
human-readable in this way for the text report). It would be much better, IMHO, 
just for this to be `Duration` since the `microseconds` field in the child 
disambiguates the units. The easy way to achieve this here is to have 
`print_metric` print both the key and the value for that key, so it can decide 
based on the type of the node what to print.
   
   The reason this matters? Mainly so that the JSON schema can evolve over time 
and not have to tie values under "Duration" to the concept of "us", in case 
those units were to change at some point.

##########
File path: include/tvm/runtime/profiling.h
##########
@@ -205,6 +205,28 @@ class ReportNode : public Object {
    *  `aggregate` is true.
    */
   String AsTable(bool sort = true, bool aggregate = true) const;
+  /*! \brief Convert this report to json.
+   *
+   * Output json will be of this format:

Review comment:
       Also, recommend using `JSON` (not `json`) in the comments for the sake 
of the autogenerated documentation.

##########
File path: python/tvm/runtime/profiling/__init__.py
##########
@@ -47,6 +47,36 @@ def csv(self):
         """
         return _ffi_api.AsCSV(self)
 
+    def json(self):
+        """Convert this profiling report into JSON format.
+
+        Example output:
+
+        .. code-block:
+
+           {"calls":

Review comment:
       See comment on C++ above, would be great if this were pretty-printed for 
readability (in the comment itself, no need for the actual JSON output).

##########
File path: include/tvm/runtime/profiling.h
##########
@@ -205,6 +205,28 @@ class ReportNode : public Object {
    *  `aggregate` is true.
    */
   String AsTable(bool sort = true, bool aggregate = true) const;
+  /*! \brief Convert this report to json.
+   *
+   * Output json will be of this format:

Review comment:
       My eyes :-) Would you mind pretty printing this JSON in the comment? It 
makes the format more readable.

##########
File path: tests/python/unittest/test_runtime_profiling.py
##########
@@ -109,5 +110,21 @@ def test_papi(target, dev):
     assert any([float(x) > 0 for x in csv[metric]])
 
 
[email protected]_llvm
+def test_json():
+    mod, params = mlp.get_workload(1)
+
+    exe = relay.vm.compile(mod, "llvm", params=params)
+    vm = profiler_vm.VirtualMachineProfiler(exe, tvm.cpu())
+
+    data = np.random.rand(1, 1, 28, 28).astype("float32")
+    report = vm.profile(data, func_name="main")
+    parsed = json.loads(report.json())
+    assert "device_metrics" in parsed
+    assert "calls" in parsed
+    assert "Duration (us)" in parsed["calls"][0]
+    assert "microseconds" in parsed["calls"][0]["Duration (us)"]

Review comment:
       To strengthen this test, could you add some assertions like:
   
   ```
   assert len(parsed["calls"]) > 0
   for call in parsed["calls"]:
     assert isinstance(call["Name"], str)
     assert isinstance(call["Count"]["count"], int)
     assert isinstance(call["Duration"]["microseconds"], float)
   ```

##########
File path: python/tvm/runtime/profiling/__init__.py
##########
@@ -47,6 +47,36 @@ def csv(self):
         """
         return _ffi_api.AsCSV(self)
 
+    def json(self):
+        """Convert this profiling report into JSON format.
+
+        Example output:
+
+        .. code-block:
+
+           {"calls":

Review comment:
       See comment on C++ above, would be great if this were pretty-printed for 
readability (in the comment itself, no need for the actual JSON output).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to