GitHub user paul-rogers commented on a diff in the pull request:
https://github.com/apache/drill/pull/1181#discussion_r177617860
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/record/AbstractRecordBatchMemoryManager.java
---
@@ -29,6 +29,50 @@
private int outgoingRowWidth;
private RecordBatchSizer sizer;
+ /**
+ * operator metric stats
+ */
+ private long numIncomingBatches;
+ private long sumInputBatchSizes;
+ private long sumInputRowWidths;
+ private long totalInputRecords;
+ private long numOutgoingBatches;
+ private long sumOutputBatchSizes;
+ private long sumOutputRowWidths;
+ private long totalOutputRecords;
+
+ public long getNumIncomingBatches() {
+ return numIncomingBatches;
+ }
+
+ public long getTotalInputRecords() {
+ return totalInputRecords;
+ }
+
+ public long getNumOutgoingBatches() {
+ return numOutgoingBatches;
+ }
+
+ public long getTotalOutputRecords() {
+ return totalOutputRecords;
+ }
+
+ public long getAvgInputBatchSize() {
+ return RecordBatchSizer.safeDivide(sumInputBatchSizes,
numIncomingBatches);
+ }
+
+ public long getAvgInputRowWidth() {
+ return RecordBatchSizer.safeDivide(sumInputRowWidths,
numIncomingBatches);
+ }
+
+ public long getAvgOutputBatchSize() {
+ return RecordBatchSizer.safeDivide(sumOutputBatchSizes,
numOutgoingBatches);
+ }
+
+ public long getAvgOutputRowWidth() {
+ return RecordBatchSizer.safeDivide(sumOutputRowWidths,
numOutgoingBatches);
--- End diff --
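Not sure if it really matters, but this calculation is not accurate. It is an
unweighted average of the per-batch row widths. The actual average row width
requires a weighted average, with each batch weighted by its row count. Suppose
one batch has a single row of 1 MB, and another batch has 1K rows of 1K each.
The average row width is actually ~2K (about 2 MB of data over 1,001 rows), not
the ~500K you get by averaging the two per-batch widths (1 MB and 1K).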
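
To make the difference concrete, here is a minimal sketch of both calculations on the two-batch example. The class `RowWidthAverages` and its methods are hypothetical names for illustration only, and the local `safeDivide` is a stand-in that uses plain integer division; the real `RecordBatchSizer.safeDivide` may round differently. In terms of the fields in the diff, the weighted version would presumably divide `sumInputBatchSizes` by `totalInputRecords`, assuming those fields hold total bytes and total rows.

```java
// Hypothetical illustration, not code from the pull request.
final class RowWidthAverages {

  // Local stand-in: integer division that yields 0 for a zero divisor.
  public static long safeDivide(long num, long denom) {
    return denom == 0 ? 0 : num / denom;
  }

  // Unweighted: mean of the per-batch row widths (sum of widths / batch count).
  public static long unweightedAvgRowWidth(long[] batchSizes, long[] rowCounts) {
    long sumRowWidths = 0;
    for (int i = 0; i < batchSizes.length; i++) {
      sumRowWidths += safeDivide(batchSizes[i], rowCounts[i]);
    }
    return safeDivide(sumRowWidths, batchSizes.length);
  }

  // Weighted: total bytes across all batches / total rows across all batches.
  public static long weightedAvgRowWidth(long[] batchSizes, long[] rowCounts) {
    long totalBytes = 0;
    long totalRows = 0;
    for (int i = 0; i < batchSizes.length; i++) {
      totalBytes += batchSizes[i];
      totalRows += rowCounts[i];
    }
    return safeDivide(totalBytes, totalRows);
  }

  public static void main(String[] args) {
    // Batch 1: one row of 1 MB. Batch 2: 1,000 rows of 1,024 bytes each.
    long[] batchSizes = {1_048_576L, 1_024_000L};
    long[] rowCounts = {1L, 1_000L};

    System.out.println("unweighted = " + unweightedAvgRowWidth(batchSizes, rowCounts));
    System.out.println("weighted   = " + weightedAvgRowWidth(batchSizes, rowCounts));
  }
}
```

With these inputs the unweighted result is 524,800 bytes (~500K) and the weighted result is 2,070 bytes (~2K), which matches the numbers above.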
---