Dmitry Lychagin has posted comments on this change. Change subject: [ASTERIXDB-2286][COMP][FUN][HYR] Parallel Sort Optimization ......................................................................
Patch Set 11: (18 comments) still reviewing, but here's my first batch of comments. https://asterix-gerrit.ics.uci.edu/#/c/2393/11/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/subplan/InlineLeftNtsInSubplanJoinFlatteningVisitor.java File asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/subplan/InlineLeftNtsInSubplanJoinFlatteningVisitor.java: Line 385: public ILogicalOperator visitForwardOperator(ForwardOperator op, Void arg) throws AlgebricksException { union/intersect operators are not supported by this visitor. they throw throw new UnsupportedOperationException( "Nested subplans with a union operator should have been disqualified for this rewriting!") should do the same for forward operator? and make sure that this visitor does not run on a subplan with forward operator. https://asterix-gerrit.ics.uci.edu/#/c/2393/11/asterixdb/asterix-lang-common/src/main/java/org/apache/asterix/lang/common/clause/OrderbyClause.java File asterixdb/asterix-lang-common/src/main/java/org/apache/asterix/lang/common/clause/OrderbyClause.java: Line 103: return isRPSSort; need to copy this new property in CloneAndSubstituteVariablesVisitor.visit(OrderbyClause) Line 106: public void setRPSSort(boolean isRPSSort) { need to give a better name to this property "sort" is redundant. "FullParallel" or "RangeParallel" may be? Line 112: return Objects.hash(modifierList, numFrames, numTuples, orderbyList); rangeMap and isRPSSort should be used in hashCode() Line 116: public boolean equals(Object object) { rangeMap and isRPSSort should be used in equals(). we don't really need an interface IRangeMap because we have 1 implementation. let's use that class instead and remove the interface. so equals() doesn't have to worry about different interface implementations https://asterix-gerrit.ics.uci.edu/#/c/2393/11/asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj File asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj: Line 206: private static final String NO_PARALLEL_SORT_HINT = "no-rps-sort"; can we give this hint a better name? it's on the orderby clause, so "-sort" is redundant. may be "no-full-parallel" or "no-range-parallel"? Line 3048: oc.setRangeMap(RangeMapBuilder.parseHint(hint.substring(RANGE_HINT.length()), parserFactory)); why do we need to create and pass SqllppParserFactory when we're already inside the parser? let's change RangeMapBuilder.parseHint() to take IParser instance as an argument. then we create the parser here: new SQLPPParser(hint.substring(...)) Line 3050: throw new ParseException(e.getMessage()); need to add source location: throw SqlppParseException(getSourceLocation(getHintToken(token)), e.getMessage()) https://asterix-gerrit.ics.uci.edu/#/c/2393/11/hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/expressions/AbstractLogicalExpression.java File hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/expressions/AbstractLogicalExpression.java: Line 41: public void setParameters(Object[] expressionParameters) { why can't we use AbstractFunctionCallExpression.setOpaqueParameters() for this instead? and propagate them to FunctionDescriptor by implementing IFunctionTypeInferer for that function (BuiltinFunctions. GLOBAL_SAMPLING)? https://asterix-gerrit.ics.uci.edu/#/c/2393/11/hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/logical/visitors/IsomorphismOperatorVisitor.java File hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/logical/visitors/IsomorphismOperatorVisitor.java: Line 604: return true; need to implement this. look at other operators in this class. https://asterix-gerrit.ics.uci.edu/#/c/2393/11/hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/properties/IPartitioningProperty.java File hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/properties/IPartitioningProperty.java: Line 72: /** it seems that it should return a new object if there need to be any changes. i.e. it should not modify the existing object. let's remove this comment. https://asterix-gerrit.ics.uci.edu/#/c/2393/6/hyracks-fullstack/algebricks/algebricks-rewriter/src/main/java/org/apache/hyracks/algebricks/rewriter/rules/EnforceStructuralPropertiesRule.java File hyracks-fullstack/algebricks/algebricks-rewriter/src/main/java/org/apache/hyracks/algebricks/rewriter/rules/EnforceStructuralPropertiesRule.java: Line 609: INodeDomain nd = pp.getNodeDomain(); looking for this pattern is not g https://asterix-gerrit.ics.uci.edu/#/c/2393/11/hyracks-fullstack/algebricks/algebricks-rewriter/src/main/java/org/apache/hyracks/algebricks/rewriter/rules/EnforceStructuralPropertiesRule.java File hyracks-fullstack/algebricks/algebricks-rewriter/src/main/java/org/apache/hyracks/algebricks/rewriter/rules/EnforceStructuralPropertiesRule.java: Line 320: throw new IllegalStateException("Couldn't remove the redundant parallel sort and its subtree"); are we sure it can never happen? what if there're other rewriting rules that'd move operators around? should we log a warning here instead? it's a redundant sort, so the plan will be inefficient, but at least the query will run. Line 689: if (parentOp.getAnnotations().get(OperatorAnnotations.USE_RPS_SORT) == Boolean.TRUE looking for a specific operator pattern is not great. you just need to pass rangeMapKey around. can we introduce another annotation that'd hold it (RANGE_MAP_KEY)? So the rule that introduces the forward operator will annotate the sort operator with this annotation and we'll just read it here? https://asterix-gerrit.ics.uci.edu/#/c/2393/11/hyracks-fullstack/hyracks/hyracks-dataflow-common/src/main/java/org/apache/hyracks/dataflow/common/data/partition/range/FieldRangePartitionComputerFactory.java File hyracks-fullstack/hyracks/hyracks-dataflow-common/src/main/java/org/apache/hyracks/dataflow/common/data/partition/range/FieldRangePartitionComputerFactory.java: Line 31: public class FieldRangePartitionComputerFactory implements ITuplePartitionComputerFactory { may be we should split it into 3 classes: 1) AbstractFieldRangePartitionComputerFactory which just has a method getRangeMap(IHyracksTaskContext) which returns a RangeMap. This class will do all the work. 2) StaticFieldRangePartitionComputerFactory which takes RangeMap in the constructor and returns it in getRangeMap() 3) DynamicFieldRangePartitionComputerFactory which takes rangeMakeKeyInContext, and looks up the RangeMap dynamically. Line 72: throw HyracksDataException.create(ErrorCode.RANGEMAP_NOT_FOUND); pass source location https://asterix-gerrit.ics.uci.edu/#/c/2393/11/hyracks-fullstack/hyracks/hyracks-dataflow-common/src/main/java/org/apache/hyracks/dataflow/common/data/partition/range/RangeMap.java File hyracks-fullstack/hyracks/hyracks-dataflow-common/src/main/java/org/apache/hyracks/dataflow/common/data/partition/range/RangeMap.java: Line 29: * split_point_idx0 split_point_idx1 split_point_idx2 split_point_idx3 split_point_idx4 need better formatting. create html table or use <pre>? https://asterix-gerrit.ics.uci.edu/#/c/2393/11/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/base/AbstractReplicateOperatorDescriptor.java File hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/base/AbstractReplicateOperatorDescriptor.java: Line 183: // TODO: shouldn't we fail the MaterializerTaskState state? there're no IFrameWriter associated with MaterializerTaskState, so I guess not -- To view, visit https://asterix-gerrit.ics.uci.edu/2393 To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-MessageType: comment Gerrit-Change-Id: I73e128029a46f45e6b68c23dfb9310d5de10582f Gerrit-PatchSet: 11 Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Owner: Ali Alsuliman <[email protected]> Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Dmitry Lychagin <[email protected]> Gerrit-Reviewer: Ildar Absalyamov <[email protected]> Gerrit-Reviewer: Jenkins <[email protected]> Gerrit-Reviewer: Taewoo Kim <[email protected]> Gerrit-Reviewer: Till Westmann <[email protected]> Gerrit-Reviewer: abdullah alamoudi <[email protected]> Gerrit-HasComments: Yes
