Till Westmann has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/3226 )
Change subject: [ASTERIXDB-2532][RT] per-operator profiling ...................................................................... Patch Set 25: (30 comments) https://asterix-gerrit.ics.uci.edu/#/c/3226/25//COMMIT_MSG Commit Message: https://asterix-gerrit.ics.uci.edu/#/c/3226/25//COMMIT_MSG@9 PS25, Line 9: the analyze : variable is set in a query I think that it would be more consistent, if we used a parameter for the HTTP interface for the profile (similar to the way we do it for the plans). https://asterix-gerrit.ics.uci.edu/#/c/3226/25/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/translator/IStatementExecutor.java File asterixdb/asterix-algebra/src/main/java/org/apache/asterix/translator/IStatementExecutor.java: https://asterix-gerrit.ics.uci.edu/#/c/3226/25/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/translator/IStatementExecutor.java@85 PS25, Line 85: transient Why don't we need to serialize the jobProfile? https://asterix-gerrit.ics.uci.edu/#/c/3226/25/asterixdb/asterix-app/src/main/java/org/apache/asterix/api/http/server/AbstractQueryApiServlet.java File asterixdb/asterix-app/src/main/java/org/apache/asterix/api/http/server/AbstractQueryApiServlet.java: https://asterix-gerrit.ics.uci.edu/#/c/3226/25/asterixdb/asterix-app/src/main/java/org/apache/asterix/api/http/server/AbstractQueryApiServlet.java@51 PS25, Line 51: jobProfile Wondering if we should just call this "profile". What do you think? https://asterix-gerrit.ics.uci.edu/#/c/3226/25/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/profile/full-scan/full-scan.3.profile.sqlpp File asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/profile/full-scan/full-scan.3.profile.sqlpp: https://asterix-gerrit.ics.uci.edu/#/c/3226/25/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/profile/full-scan/full-scan.3.profile.sqlpp@27 PS25, Line 27: set analyze "true"; I think that it would be more consistent, if we used a parameter for the HTTP interface for the profile (similar to the way we do it for the plans). https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/algebricks/algebricks-runtime/src/main/java/org/apache/hyracks/algebricks/runtime/operators/aggreg/NestedPlansRunningAggregatorFactory.java File hyracks-fullstack/algebricks/algebricks-runtime/src/main/java/org/apache/hyracks/algebricks/runtime/operators/aggreg/NestedPlansRunningAggregatorFactory.java: https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/algebricks/algebricks-runtime/src/main/java/org/apache/hyracks/algebricks/runtime/operators/aggreg/NestedPlansRunningAggregatorFactory.java@69 PS25, Line 69: if (profile) { It seems that it should be possible to have both wrappers: if (profile) { fw = TimedFrameWriter.time(fw, ctx, "Aggregate Writer"); } if (enforce) { fw = EnforceFrameWriter.enforce(fw); } Is there a known reason why that wouldn't work? https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/algebricks/algebricks-runtime/src/main/java/org/apache/hyracks/algebricks/runtime/operators/meta/PipelineAssembler.java File hyracks-fullstack/algebricks/algebricks-runtime/src/main/java/org/apache/hyracks/algebricks/runtime/operators/meta/PipelineAssembler.java: https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/algebricks/algebricks-runtime/src/main/java/org/apache/hyracks/algebricks/runtime/operators/meta/PipelineAssembler.java@65 PS25, Line 65: start it seems that the wrapping code from NestedPlansRunningAggregatorFactory should be applicable here as well https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/algebricks/algebricks-runtime/src/main/java/org/apache/hyracks/algebricks/runtime/operators/meta/PipelineAssembler.java@108 PS25, Line 108: start it seems that the wrapping code from NestedPlansRunningAggregatorFactory should be applicable here as well https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/dataflow/IPassableTimer.java File hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/dataflow/IPassableTimer.java: https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/dataflow/IPassableTimer.java@21 PS25, Line 21: IPassableTimer Add some explanation to this interface? https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/dataflow/TimedFrameWriter.java File hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/dataflow/TimedFrameWriter.java: https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/dataflow/TimedFrameWriter.java@40 PS25, Line 40: IStatsCollector collector This should be final, as we synchronize on it. It seems that it should be feasible to initialize it in every constructor. https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/dataflow/TimedFrameWriter.java@43 PS25, Line 43: IHyracksTaskContext Do we actually need the full task context or would the IStatsCollector be sufficient? https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/dataflow/TimedFrameWriter.java@50 PS25, Line 50: TimedFrameWriter Could we make this constructor call the other constructor? https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/dataflow/TimedFrameWriter.java@61 PS25, Line 61: catch (HyracksDataException e) I think that it might be better to just throw HyracksDataException up the stack. I believe that in this part of the code HyracksDataExceptions are thrown in many places and we'll only have to add a few throws declarations. It would also save us from logging here. https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/dataflow/TimedFrameWriter.java@56 PS25, Line 56: stats = collector.getOperatorStats(name); : if (collector.getOperatorStats(name) == null) { : stats = new OperatorStats(name); : collector.add(stats); : } : I think that the collector should create new stats if the stats are not tracked by the collector already. https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/dataflow/TimedOperatorNodePushable.java File hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/dataflow/TimedOperatorNodePushable.java: https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/dataflow/TimedOperatorNodePushable.java@37 PS25, Line 37: skipFw What does this flag indicate? https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/dataflow/TimedOperatorNodePushable.java@50 PS25, Line 50: if (collector.getOperatorStats(name) == null) { same as for TimedFrameWriter https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/dataflow/TimedOperatorNodePushable.java@129 PS25, Line 129: 1 Would > 0 work as well? https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/job/profiling/IStatsCollector.java File hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/job/profiling/IStatsCollector.java: https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/job/profiling/IStatsCollector.java@45 PS25, Line 45: Map add comment? https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/job/profiling/IStatsCollector.java@53 PS25, Line 53: takeClock add comment? What is the return value? https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/job/profiling/IStatsCollector.java@55 PS25, Line 55: giveClock add comment? https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-control/hyracks-control-common/src/main/java/org/apache/hyracks/control/common/job/profiling/StatsCollector.java File hyracks-fullstack/hyracks/hyracks-control/hyracks-control-common/src/main/java/org/apache/hyracks/control/common/job/profiling/StatsCollector.java: https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-control/hyracks-control-common/src/main/java/org/apache/hyracks/control/common/job/profiling/StatsCollector.java@42 PS25, Line 42: PROFILE_PROP_NAME If we use an HTTP parameter to request the profile for a request, we don't need this anymore. https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-control/hyracks-control-common/src/main/java/org/apache/hyracks/control/common/job/profiling/StatsCollector.java@54 PS25, Line 54: return operatorStatsMap.get(operatorName); replace with return operatorStatsMap.computeIfAbsent(operatorName, OperatorStats::new); to always get an Object (would need to fix the comment on the interface as well). https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-control/hyracks-control-common/src/main/java/org/apache/hyracks/control/common/job/profiling/om/TaskProfile.java File hyracks-fullstack/hyracks/hyracks-control/hyracks-control-common/src/main/java/org/apache/hyracks/control/common/job/profiling/om/TaskProfile.java: https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-control/hyracks-control-common/src/main/java/org/apache/hyracks/control/common/job/profiling/om/TaskProfile.java@112 PS25, Line 112: Map<String, IOperatorStats> opTimes = statsCollector.getAllOperatorStats(); Could this be a method? https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-control/hyracks-control-common/src/main/java/org/apache/hyracks/control/common/job/profiling/om/TaskProfile.java@121 PS25, Line 121: json.put("counters", countersObj); Does this overwrite the field "counters" that was populated by populateCounters before? If so, could you add a comment why we want to overwrite that field? https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-control/hyracks-control-nc/src/main/java/org/apache/hyracks/control/nc/work/StartTasksWork.java File hyracks-fullstack/hyracks/hyracks-control/hyracks-control-nc/src/main/java/org/apache/hyracks/control/nc/work/StartTasksWork.java: https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-control/hyracks-control-nc/src/main/java/org/apache/hyracks/control/nc/work/StartTasksWork.java@183 PS25, Line 183: (enforce && !profile) Could we have both? https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/group/sort/SortGroupByOperatorDescriptor.java File hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/group/sort/SortGroupByOperatorDescriptor.java: https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/group/sort/SortGroupByOperatorDescriptor.java@153 PS25, Line 153: runGen inline? https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/sort/AbstractExternalSortRunMerger.java File hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/sort/AbstractExternalSortRunMerger.java: https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/sort/AbstractExternalSortRunMerger.java@61 PS25, Line 61: ICounter counter; : IStatsCollector collector; : These seem to be unused. Could we revert the file? https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/sort/ExternalSortOperatorDescriptor.java File hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/sort/ExternalSortOperatorDescriptor.java: https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/sort/ExternalSortOperatorDescriptor.java@86 PS25, Line 86: runGen inline? https://asterix-gerrit.ics.uci.edu/#/c/3226/19/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/sort/IRunGenerator.java File hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/sort/IRunGenerator.java: https://asterix-gerrit.ics.uci.edu/#/c/3226/19/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/sort/IRunGenerator.java@37 PS19, Line 37: getSorter What is this? Add a comment? https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/sort/TopKSorterOperatorDescriptor.java File hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/sort/TopKSorterOperatorDescriptor.java: https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/sort/TopKSorterOperatorDescriptor.java@69 PS25, Line 69: runGen inline? https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-storage-am-common/src/main/java/org/apache/hyracks/storage/am/common/dataflow/IndexSearchOperatorNodePushable.java File hyracks-fullstack/hyracks/hyracks-storage-am-common/src/main/java/org/apache/hyracks/storage/am/common/dataflow/IndexSearchOperatorNodePushable.java: https://asterix-gerrit.ics.uci.edu/#/c/3226/25/hyracks-fullstack/hyracks/hyracks-storage-am-common/src/main/java/org/apache/hyracks/storage/am/common/dataflow/IndexSearchOperatorNodePushable.java@166 PS25, Line 166: IStatsCollector collector = ctx.getStatsCollector(); : if (collector != null) { : stats = collector.getOperatorStats(getDisplayName()); : } : if (stats == null) { : stats = new OperatorStats(getDisplayName()); : if (collector != null) { : collector.add(stats); : } : } : Could be simplified if collector.getOperatorStats would always return an existing or a new object (like Map::computeIfAbsent). -- To view, visit https://asterix-gerrit.ics.uci.edu/3226 To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie16f3901ae5b32920d8552d5fd1ec8bb6e2ec8ae Gerrit-Change-Number: 3226 Gerrit-PatchSet: 25 Gerrit-Owner: Ian Maxon <ima...@uci.edu> Gerrit-Reviewer: Ali Alsuliman <ali.al.solai...@gmail.com> Gerrit-Reviewer: Anon. E. Moose (1000171) Gerrit-Reviewer: Ian Maxon <ima...@uci.edu> Gerrit-Reviewer: Jenkins <jenk...@fulliautomatix.ics.uci.edu> Gerrit-Reviewer: Till Westmann <ti...@apache.org> Gerrit-Comment-Date: Mon, 15 Apr 2019 03:00:41 +0000 Gerrit-HasComments: Yes