Ali Alsuliman has posted comments on this change.

Change subject: [ASTERIXDB-2286][COMP][FUN][HYR] Parallel Sort Optimization
......................................................................


Patch Set 16:

(14 comments)

https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/CheckFullParallelSortRule.java
File 
asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/CheckFullParallelSortRule.java:

Line 89:         if (clusterNodeDomain.cardinality() == null || 
clusterNodeDomain.cardinality() <= 1) {
> can these checks be done in AbstractStableSortPOperator.getRequiredProperti
Done.


Line 102:             if (parent.getOperatorTag() == LogicalOperatorTag.LIMIT 
&& ((LimitOperator) parent).isTopmostLimitOp()) {
> These checks seem to be requirements that top-most limit and running aggrea
I will have to keep these two here. As we discussed, it cannot be done in 
AbstractStableSortPOperator.getRequiredPropertiesForChildren(). Also, I tried 
to move these two to EnforceStructuralPropertiesRule but it caused some 
problems with some queries. I would say let's keep them here for now and 
explore how to solve this issue in general in a separate change.


https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/subplan/InlineAllNtsInSubplanVisitor.java
File 
asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/subplan/InlineAllNtsInSubplanVisitor.java:

Line 648:         // TODO(ali): check this
> let's just throw a compilation exception here. same as visitLeftOuterUnnest
Done. Check the error message text if fine.


https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/subplan/SubplanSpecialFlatteningCheckVisitor.java
File 
asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/subplan/SubplanSpecialFlatteningCheckVisitor.java:

Line 235:         return visitInputs(op);
> throw an error here? there should not be any forward operators in the plan 
Done. Check the error message text.


https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/translator/LangExpressionToPlanTranslator.java
File 
asterixdb/asterix-algebra/src/main/java/org/apache/asterix/translator/LangExpressionToPlanTranslator.java:

Line 1174:             // TODO(ali): would it be better to use a variable 
instead of annotation? that way whenever someone creates
> here's an idea that we should discuss. What if we move no-full-parallel hin
Alright. Let's do it. But we will have to keep USE_DYNAMIC_RANGE when we check 
for full parallel sort in CheckFullParallelSortRule and would like to disable 
full parallel sort.


https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-app/src/main/java/org/apache/asterix/app/resource/RequiredCapacityVisitor.java
File 
asterixdb/asterix-app/src/main/java/org/apache/asterix/app/resource/RequiredCapacityVisitor.java:

Line 310:         visitInternal(op, true);
> Forward is a blocking operator, should we call calculateMemoryUsageForBlock
I would call calculateMemoryUsageForBlockingOperators() and pass frameSize for 
the argument "memSize" since forward op does not require a memory budget. This 
is essentially the same as visitInternal(). This is because the name of this 
method is misleading. As far as I know, it doesn't take into account the stages 
created by blocking operators. It just calculates the memory required 
differently for operators that require memory budget and forward operator 
doesn't. So, I don't think we should call 
calculateMemoryUsageForBlockingOperators() for forward operator. Let me know if 
I am missing something here. BTW, "PlanStagesGenerator.java" is another 
implementation that considers stages and blocking operators properly, and I 
included forward operator there.


https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/GlobalSamplingAggregateDescriptor.java
File 
asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/GlobalSamplingAggregateDescriptor.java:

Line 173:                     localSamples.getStartOffset(), 
localSamples.getLength());
> let's move ByteArrayAccessibleInputStream to org.apache.hyracks.data.std.ut
Done


Line 175:             int numberOfSamples = localSamplesList.size();
> don't create DateInputStream instance for each tuple. create it once with B
Done


https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/LocalSamplingAggregateDescriptor.java
File 
asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/LocalSamplingAggregateDescriptor.java:

Line 77:         private static final int NUM_SAMPLES = 100;
> have we thought about making it configurable (probably in a separate change
That would be nice. How about doing it in a separate change and filing an issue 
now so that we don't forget it?


Line 113:                 System.arraycopy(inputFieldValue.getByteArray(), 
inputFieldValue.getStartOffset(), fieldValue, 0,
> can we just produce the output right here at each step (using IAsterixListB
Done. But I had to use 2 list builders, one for building the sample which is a 
list itself, and another for building the list of samples. Not sure if this is 
what you were referring to, though. I also had to change 
GlobalSamplingAggregateDescriptor. Check if it looks good as I have moved the 
code around quite a bit and I may have missed something.


https://asterix-gerrit.ics.uci.edu/#/c/2393/13/hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/logical/ForwardOperator.java
File 
hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/logical/ForwardOperator.java:

Line 45:     private LogicalVariable rangeMapVariable;
> minor, just for consistency. An operator assigns variables and consumes exp
Done


https://asterix-gerrit.ics.uci.edu/#/c/2393/13/hyracks-fullstack/algebricks/algebricks-rewriter/src/main/java/org/apache/hyracks/algebricks/rewriter/rules/EnforceStructuralPropertiesRule.java
File 
hyracks-fullstack/algebricks/algebricks-rewriter/src/main/java/org/apache/hyracks/algebricks/rewriter/rules/EnforceStructuralPropertiesRule.java:

Line 611:                     break;
> sure. let's discuss. I think it makes sense to unify them at this point. if
Already talked about this I think. For now, let's keep reviewing other parts of 
the change, and come back to this later at the very end and see if we could 
include it in this change. The reason is that we might end up including a big 
change for it even though it might seem small.


Line 666: 
> right, I also doubt that it was ever executed. but if we have a static map 
I passed the range map now even though we know it will not come here. Let's 
leave it to the very end after we are done addressing all the other comments. 
We may be able to fix this in this change.


https://asterix-gerrit.ics.uci.edu/#/c/2393/13/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/collectors/SequentialMergeFrameReader.java
File 
hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/collectors/SequentialMergeFrameReader.java:

Line 69:             sender.close();
> what if one sender's close() fails. should we continue closing the remainin
I would say yes. This implementation is similar to "SortMergeFrameReader" which 
encapsulates "RunMergingFrameReader" but here it gets the next frame 
sequentially one sender/partition at a time instead of picking the "topmost" 
from all the senders (there is a priority Q in RunMergingFrameReader). The 
close() method in "RunMergingFrameReader" closes sender like this. So, I just 
followed the same pattern. What's your thought?


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/2393
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I73e128029a46f45e6b68c23dfb9310d5de10582f
Gerrit-PatchSet: 16
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Ali Alsuliman <[email protected]>
Gerrit-Reviewer: Ali Alsuliman <[email protected]>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Dmitry Lychagin <[email protected]>
Gerrit-Reviewer: Ildar Absalyamov <[email protected]>
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Reviewer: Taewoo Kim <[email protected]>
Gerrit-Reviewer: Till Westmann <[email protected]>
Gerrit-Reviewer: abdullah alamoudi <[email protected]>
Gerrit-HasComments: Yes

Reply via email to