>From Ali Alsuliman <[email protected]>:

Attention is currently required from: Ali Alsuliman, Ian Maxon, Michael Blow.

Ali Alsuliman has posted comments on this change by Ian Maxon. ( 
https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20270?usp=email )

Change subject: [ASTERIXDB-3631][RT] Profile nested groupby clauses
......................................................................


Patch Set 13:

(14 comments)

File 
hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/jobgen/impl/JobBuilder.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20270/comment/182642f1_d66e2abc?usp=email
 :
PS13, Line 193: List<Integer> paths = new ArrayList<>(inputs.size());
I am having a hard time following the logic for `paths` and 
`Collections.max(paths)` and `prevOpIndex + 1`. Is this so that we match how 
the push runtime is constructing the ids?

What if in `getExtendedOdidForNestedOp()`, we just pass the "baseId" to 
`getExtendedOdidForOperator()` as:
```
int j = 0;
for (Mutable<ILogicalOperator> root : plan.getRoots()) {
    getExtendedOdidForOperator(baseId + "." + i, root.getValue(), log2phys, j);
    j++;
}
```

And then in `getExtendedOdidForOperator()` just continue building:
```
for (int i = 0; i < inputs.size(); i++) {
  if (nextOp instanceof AbstractOperatorWithNestedPlans) {
    getExtendedOdidForNestedOp((AbstractOperatorWithNestedPlans) nextOp, 
log2phys, baseId + "." + input + "." + i);
  }
}
```
`getExtendedOdidForNestedOp()` won't need to return int.


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20270/comment/27120f3e_b838e81c?usp=email
 :
PS13, Line 199: paths.isEmpty()
Can "paths.isEmpty()" ever be empty? seems like it should always have something.


File 
hyracks-fullstack/algebricks/algebricks-runtime/src/main/java/org/apache/hyracks/algebricks/runtime/operators/meta/AlgebricksMetaOperatorDescriptor.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20270/comment/7c172a12_e57be5b3?usp=email
 :
PS13, Line 105: subIds.length() > 2
Is this reliable to determine it's Subplan? I didn't follow the logic, but it 
feels like we are assuming that there is only single digits between the "."
Is that the case?


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20270/comment/eacd5f54_4f46d383?usp=email
 :
PS13, Line 140: public
Does `traverseFactories()` need to be `public` or is it just used here and can 
be made `private`?


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20270/comment/4e6eacfe_c8e46487?usp=email
 :
PS13, Line 153: else
Previously, we were always adding: `microOpStats.put(fact, 
makeStatForRuntimeFact`. Now, it's in the `else` block. What changed? just 
trying to understand.


File 
hyracks-fullstack/algebricks/algebricks-runtime/src/main/java/org/apache/hyracks/algebricks/runtime/operators/meta/PipelineAssembler.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20270/comment/243d43d4_ec7c3163?usp=email
 :
PS13, Line 172: profiling(ctx)
You already have `boolean profile` above. You can replace 
`ctx.getJobFlags().contains(JobFlag.PROFILE_RUNTIME)` with `profiling(ctx)` and 
re-use the boolean variable.


File 
hyracks-fullstack/algebricks/algebricks-runtime/src/main/java/org/apache/hyracks/algebricks/runtime/operators/meta/SubplanRuntimeFactory.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20270/comment/b3ba88c6_0ad8ec04?usp=email
 :
PS13, Line 138: private final boolean profile;
Seems like "profile" can be just a local variable instead of instance variable


File 
hyracks-fullstack/algebricks/algebricks-runtime/src/main/java/org/apache/hyracks/algebricks/runtime/operators/std/NestedTupleSourceRuntimeFactory.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20270/comment/aa4052d8_f120c424?usp=email
 :
PS13, Line 77: INestedTupleSourceRuntime nestedWrapped;
Make it `private final` if we can.


File 
hyracks-fullstack/algebricks/algebricks-runtime/src/main/java/org/apache/hyracks/algebricks/runtime/operators/win/WindowAggregatorDescriptorFactory.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20270/comment/090edb2c_597d2de2?usp=email
 :
PS13, Line 78: .assemblePipeline(subplan, outputWriter, ctx, 
pipelineRuntimeMap, Collections.emptyMap()).getLeft()
Seems like you are using `Collections.emptyMap()` to indicate non-profiling. We 
should probably have a separate method: 
`PipelineAssembler.assembleNonProfiledPipeline()`. We can do that later.


File 
hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/dataflow/ProfiledFrameWriter.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20270/comment/54f95f06_218d6eb5?usp=email
 :
PS13, Line 48: protected
I haven't checked. Is this change to `protected` intended or unintended?


File 
hyracks-fullstack/hyracks/hyracks-control/hyracks-control-common/src/main/java/org/apache/hyracks/control/common/job/profiling/StatsCollector.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20270/comment/f6320113_dbf4182f?usp=email
 :
PS13, Line 85: output.writeInt(operatorStatsMap.size());
             :         for (IOperatorStats stat : operatorStatsMap.values()) {
             :             stat.writeFields(output);
             :         }
Remind me to come back to this and whether `operatorStatsMap` needs to be 
synchronized or not.


File 
hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/group/preclustered/PreclusteredGroupOperatorNodePushable.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20270/comment/0455f2ed_b070f0d9?usp=email
 :
PS13, Line 36: PreclusteredGroupOperatorNodePushable
Just checking, when profiling, does this nodepushable become 
"ProfiledOperatorNodePushable"? how is its time captured and conveyed back 
given that it's also doing profiling for the aggregator. Probably the same 
question applies to `AlgebricksMetaOperatorDescriptor`


File 
hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/group/preclustered/PreclusteredGroupWriter.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20270/comment/f9f4c75a_d53d0904?usp=email
 :
PS13, Line 66: private Runnable timingHandle = Function::identity;
We can just simplify and remove this instance variable. We can just do 
((IProfiledAggregatorDescriptor) aggregator).computeTimings()


File 
hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/util/ProfilingUtils.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20270/comment/dafeab66_df0bd141?usp=email
 :
PS13, Line 27: ctx != null
Can `ctx != null` ever be null? I don't think so. We can remove this check.



--
To view, visit https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20270?usp=email
To unsubscribe, or for help writing mail filters, visit 
https://asterix-gerrit.ics.uci.edu/settings?usp=email

Gerrit-MessageType: comment
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: I516dc90b8da3a7086dc80b67946ac14f6ade0973
Gerrit-Change-Number: 20270
Gerrit-PatchSet: 13
Gerrit-Owner: Ian Maxon <[email protected]>
Gerrit-Reviewer: Ali Alsuliman <[email protected]>
Gerrit-Reviewer: Ali Alsuliman <[email protected]>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Ian Maxon <[email protected]>
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-CC: Michael Blow <[email protected]>
Gerrit-Attention: Ian Maxon <[email protected]>
Gerrit-Attention: Ali Alsuliman <[email protected]>
Gerrit-Attention: Michael Blow <[email protected]>
Gerrit-Comment-Date: Wed, 28 Jan 2026 18:23:57 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No

Reply via email to