[ https://issues.apache.org/jira/browse/DRILL-6197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16382245#comment-16382245 ]
ASF GitHub Bot commented on DRILL-6197:
---------------------------------------
Github user amansinha100 commented on a diff in the pull request:
https://github.com/apache/drill/pull/1141#discussion_r171611927
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentStats.java ---
@@ -79,4 +71,21 @@ public void addOperatorStats(OperatorStats stats) {
operators.add(stats);
}
+ //DRILL-6197
+ public OperatorStats addOrReplaceOperatorStats(OperatorStats stats) {
+ //Remove existing stat
+ OperatorStats replacedStat = null;
+ int index = 0;
+ for (OperatorStats opStat : operators) {
--- End diff --
I am worried about the small overhead of this linear search adding
up across operators, especially for queries with complex plans. Stats
collection should ideally impose minimal overhead. Do the operator
stats have to be a list, or could this just use a Set?
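
A minimal sketch of one alternative to the linear scan, assuming operator IDs are unique within a fragment: keying the stats by operator ID in a LinkedHashMap makes add-or-replace a constant-time put and still preserves insertion order. The OpStats and StatsRegistry names are hypothetical stand-ins for illustration, not Drill's actual classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for OperatorStats; only the operator ID matters here.
final class OpStats {
  final int operatorId;
  final long records;

  OpStats(int operatorId, long records) {
    this.operatorId = operatorId;
    this.records = records;
  }
}

// Hypothetical registry keyed by operator ID (assumed unique per fragment).
final class StatsRegistry {
  private final Map<Integer, OpStats> operators = new LinkedHashMap<>();

  // Adds the stats, replacing any entry with the same operator ID.
  // Returns the replaced entry, or null if there was none.
  OpStats addOrReplace(OpStats stats) {
    return operators.put(stats.operatorId, stats);
  }

  // Returns the current stats in first-insertion order.
  List<OpStats> snapshot() {
    return new ArrayList<>(operators.values());
  }
}
```

A plain Set would also work if OperatorStats defined equals and hashCode on the operator ID, but a map keeps the replaced instance easy to return.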
> Duplicate entries in inputProfiles of minor fragments for specific operators
> ----------------------------------------------------------------------------
>
> Key: DRILL-6197
> URL: https://issues.apache.org/jira/browse/DRILL-6197
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Monitoring
> Affects Versions: 1.12.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Priority: Minor
> Fix For: 1.13.0
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> Minor fragments for the following operators show duplicate entries of the
> inputProfile ({{org.apache.drill.exec.ops.OperatorStats}} instance) when
> viewed in the Profile UI.
> e.g.
> {code:json}
> {
> ...
> "query": "select * from sys.version",
> ...
> [ ...
> {
> "inputProfile": [{
> "records": 0,
> "batches": 0,
> "schemas": 0
> }],
> "operatorId": 0,
> "operatorType": 13,
> "setupNanos": 0,
> "processNanos": 0,
> "peakLocalMemoryAllocated": 27131904,
> "waitNanos": 0
> },
> {
> "inputProfile": [{
> "records": 1,
> "batches": 1,
> "schemas": 1
> }],
> "operatorId": 0,
> "operatorType": 13,
> "setupNanos": 0,
> "processNanos": 752448,
> "peakLocalMemoryAllocated": 27131904,
> "metric": [{
> "metricId": 0,
> "longValue": 178
> }],
> "waitNanos": 889492
> }]
> ...
> }
> {code}
> {{operatorType: 13}} is the screen operator, for which there can be only one
> inputProfile.
> It turns out that by default, all minor fragments' operators are provided a
> list of inputProfiles by
> {{org.apache.drill.exec.ops.FragmentStats.newOperatorStats(OpProfileDef,
> BufferAllocator)}}. However, for the following 4 operators, the
> {{org.apache.drill.exec.physical.impl.BaseRootExec}} constructors also inject
> {{OperatorStats}}.
> {code:java}
> org.apache.drill.exec.proto.beans.CoreOperatorType.SCREEN
> org.apache.drill.exec.proto.beans.CoreOperatorType.SINGLE_SENDER
> org.apache.drill.exec.proto.beans.CoreOperatorType.BROADCAST_SENDER
> org.apache.drill.exec.proto.beans.CoreOperatorType.HASH_PARTITION_SENDER
> {code}
> All updates to the inputProfiles are done by the latter, while the former
> only reports zero values.
> The workaround is to have {{org.apache.drill.exec.ops.FragmentStats}} skip
> injecting the {{org.apache.drill.exec.ops.OperatorStats}} instance for these
> operators.
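
The workaround described in the issue could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual patch: OpType is a hypothetical local enum that mirrors the four CoreOperatorType names listed in the issue (plus FILTER for contrast), and StatsInjectionPolicy is an invented helper.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical local enum; the first four names mirror the operators named in the issue.
enum OpType { SCREEN, SINGLE_SENDER, BROADCAST_SENDER, HASH_PARTITION_SENDER, FILTER }

// Hypothetical helper deciding whether the fragment-level stats should inject OperatorStats.
final class StatsInjectionPolicy {
  // Operators whose root executor is assumed to inject its own OperatorStats.
  private static final Set<OpType> ROOT_MANAGED = EnumSet.of(
      OpType.SCREEN,
      OpType.SINGLE_SENDER,
      OpType.BROADCAST_SENDER,
      OpType.HASH_PARTITION_SENDER);

  private StatsInjectionPolicy() {
  }

  // Returns false for operators that already receive stats elsewhere.
  static boolean shouldInjectStats(OpType type) {
    return !ROOT_MANAGED.contains(type);
  }
}
```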