Ali Alsuliman has posted comments on this change. Change subject: [ASTERIXDB-2286][COMP][FUN][HYR] Parallel Sort Optimization ......................................................................
Patch Set 16: (14 comments) https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/CheckFullParallelSortRule.java File asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/CheckFullParallelSortRule.java: Line 89: if (clusterNodeDomain.cardinality() == null || clusterNodeDomain.cardinality() <= 1) { > can these checks be done in AbstractStableSortPOperator.getRequiredProperti Done. Line 102: if (parent.getOperatorTag() == LogicalOperatorTag.LIMIT && ((LimitOperator) parent).isTopmostLimitOp()) { > These checks seem to be requirements that top-most limit and running aggrea I will have to keep these two here. As we discussed, it cannot be done in AbstractStableSortPOperator.getRequiredPropertiesForChildren(). Also, I tried to move these two to EnforceStructuralPropertiesRule but it caused some problems with some queries. I would say let's keep them here for now and explore how to solve this issue in general in a separate change. https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/subplan/InlineAllNtsInSubplanVisitor.java File asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/subplan/InlineAllNtsInSubplanVisitor.java: Line 648: // TODO(ali): check this > let's just throw a compilation exception here. same as visitLeftOuterUnnest Done. Check the error message text if fine. https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/subplan/SubplanSpecialFlatteningCheckVisitor.java File asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/subplan/SubplanSpecialFlatteningCheckVisitor.java: Line 235: return visitInputs(op); > throw an error here? there should not be any forward operators in the plan Done. Check the error message text. https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/translator/LangExpressionToPlanTranslator.java File asterixdb/asterix-algebra/src/main/java/org/apache/asterix/translator/LangExpressionToPlanTranslator.java: Line 1174: // TODO(ali): would it be better to use a variable instead of annotation? that way whenever someone creates > here's an idea that we should discuss. What if we move no-full-parallel hin Alright. Let's do it. But we will have to keep USE_DYNAMIC_RANGE when we check for full parallel sort in CheckFullParallelSortRule and would like to disable full parallel sort. https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-app/src/main/java/org/apache/asterix/app/resource/RequiredCapacityVisitor.java File asterixdb/asterix-app/src/main/java/org/apache/asterix/app/resource/RequiredCapacityVisitor.java: Line 310: visitInternal(op, true); > Forward is a blocking operator, should we call calculateMemoryUsageForBlock I would call calculateMemoryUsageForBlockingOperators() and pass frameSize for the argument "memSize" since forward op does not require a memory budget. This is essentially the same as visitInternal(). This is because the name of this method is misleading. As far as I know, it doesn't take into account the stages created by blocking operators. It just calculates the memory required differently for operators that require memory budget and forward operator doesn't. So, I don't think we should call calculateMemoryUsageForBlockingOperators() for forward operator. Let me know if I am missing something here. BTW, "PlanStagesGenerator.java" is another implementation that considers stages and blocking operators properly, and I included forward operator there. https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/GlobalSamplingAggregateDescriptor.java File asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/GlobalSamplingAggregateDescriptor.java: Line 173: localSamples.getStartOffset(), localSamples.getLength()); > let's move ByteArrayAccessibleInputStream to org.apache.hyracks.data.std.ut Done Line 175: int numberOfSamples = localSamplesList.size(); > don't create DateInputStream instance for each tuple. create it once with B Done https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/LocalSamplingAggregateDescriptor.java File asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/LocalSamplingAggregateDescriptor.java: Line 77: private static final int NUM_SAMPLES = 100; > have we thought about making it configurable (probably in a separate change That would be nice. How about doing it in a separate change and filing an issue now so that we don't forget it? Line 113: System.arraycopy(inputFieldValue.getByteArray(), inputFieldValue.getStartOffset(), fieldValue, 0, > can we just produce the output right here at each step (using IAsterixListB Done. But I had to use 2 list builders, one for building the sample which is a list itself, and another for building the list of samples. Not sure if this is what you were referring to, though. I also had to change GlobalSamplingAggregateDescriptor. Check if it looks good as I have moved the code around quite a bit and I may have missed something. https://asterix-gerrit.ics.uci.edu/#/c/2393/13/hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/logical/ForwardOperator.java File hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/logical/ForwardOperator.java: Line 45: private LogicalVariable rangeMapVariable; > minor, just for consistency. An operator assigns variables and consumes exp Done https://asterix-gerrit.ics.uci.edu/#/c/2393/13/hyracks-fullstack/algebricks/algebricks-rewriter/src/main/java/org/apache/hyracks/algebricks/rewriter/rules/EnforceStructuralPropertiesRule.java File hyracks-fullstack/algebricks/algebricks-rewriter/src/main/java/org/apache/hyracks/algebricks/rewriter/rules/EnforceStructuralPropertiesRule.java: Line 611: break; > sure. let's discuss. I think it makes sense to unify them at this point. if Already talked about this I think. For now, let's keep reviewing other parts of the change, and come back to this later at the very end and see if we could include it in this change. The reason is that we might end up including a big change for it even though it might seem small. Line 666: > right, I also doubt that it was ever executed. but if we have a static map I passed the range map now even though we know it will not come here. Let's leave it to the very end after we are done addressing all the other comments. We may be able to fix this in this change. https://asterix-gerrit.ics.uci.edu/#/c/2393/13/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/collectors/SequentialMergeFrameReader.java File hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/collectors/SequentialMergeFrameReader.java: Line 69: sender.close(); > what if one sender's close() fails. should we continue closing the remainin I would say yes. This implementation is similar to "SortMergeFrameReader" which encapsulates "RunMergingFrameReader" but here it gets the next frame sequentially one sender/partition at a time instead of picking the "topmost" from all the senders (there is a priority Q in RunMergingFrameReader). The close() method in "RunMergingFrameReader" closes sender like this. So, I just followed the same pattern. What's your thought? -- To view, visit https://asterix-gerrit.ics.uci.edu/2393 To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-MessageType: comment Gerrit-Change-Id: I73e128029a46f45e6b68c23dfb9310d5de10582f Gerrit-PatchSet: 16 Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Owner: Ali Alsuliman <[email protected]> Gerrit-Reviewer: Ali Alsuliman <[email protected]> Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Dmitry Lychagin <[email protected]> Gerrit-Reviewer: Ildar Absalyamov <[email protected]> Gerrit-Reviewer: Jenkins <[email protected]> Gerrit-Reviewer: Taewoo Kim <[email protected]> Gerrit-Reviewer: Till Westmann <[email protected]> Gerrit-Reviewer: abdullah alamoudi <[email protected]> Gerrit-HasComments: Yes
