Dmitry Lychagin has posted comments on this change.

Change subject: [ASTERIXDB-2286][COMP][FUN][HYR] Parallel Sort Optimization
......................................................................


Patch Set 13:

(13 comments)

done. please take a look and let's discuss at some point.

https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/CheckFullParallelSortRule.java
File 
asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/CheckFullParallelSortRule.java:

Line 89:         if (clusterNodeDomain.cardinality() == null || 
clusterNodeDomain.cardinality() <= 1) {
can these checks be done in 
AbstractStableSortPOperator.getRequiredPropertiesForChildren()?
Why should treat USE_DYNAMIC_RANGE as a hint there and only require 
OrderedPartitionedProperty there if this hint is set and these conditions are 
satisfied.


Line 102:             if (parent.getOperatorTag() == LogicalOperatorTag.LIMIT 
&& ((LimitOperator) parent).isTopmostLimitOp()) {
These checks seem to be requirements that top-most limit and running aggreagte 
operators should place on their children properties. So top-most limit should 
require UNPARTITIONED input. The we could check parent requirements in 
AbstractStableSortPOperator.getRequiredPropertiesForChildren() and decide 
whether to introduce full parallel sort there or not. This approach is more 
generic because it propagates requirements through the plan and other operators 
could add their own requirements. Can we do all this and eliminate this rule?


https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/subplan/InlineAllNtsInSubplanVisitor.java
File 
asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/subplan/InlineAllNtsInSubplanVisitor.java:

Line 648:         // TODO(ali): check this
let's just throw a compilation exception here. same as 
visitLeftOuterUnnestMapOperator() does. this visitor should not be called after 
the property enforcement rule.


https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/subplan/SubplanSpecialFlatteningCheckVisitor.java
File 
asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/subplan/SubplanSpecialFlatteningCheckVisitor.java:

Line 235:         return visitInputs(op);
throw an error here? there should not be any forward operators in the plan at 
this point


https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/translator/LangExpressionToPlanTranslator.java
File 
asterixdb/asterix-algebra/src/main/java/org/apache/asterix/translator/LangExpressionToPlanTranslator.java:

Line 1174:             // TODO(ali): would it be better to use a variable 
instead of annotation? that way whenever someone creates
here's an idea that we should discuss. What if we move no-full-parallel hint to 
the statement level, like we have "noindexonly" 
(AbstractIntroduceAccessMethodRule.NO_INDEX_ONLY_PLAN_OPTION). So by default we 
consider all orderbys to be full parallel with dynamic range (or static range 
if the range was provided as a hint on the order by clause). Just because we 
consider them full-parallel does not mean that they will become full parallel 
because they'll be subject to additional checks that are performed by 
CheckFullParallelSortRule. Also users can override this behavior by setting 
"nofullparallelsort" hint at the statement level. In that case we'll never 
consider them full parallel so they'll do global merge as before this feature. 
It's unlikely that users will need to control fullparallel per each orderbby 
clause, so per-statement setting should be sufficient. With this approach we 
don't need "unpartitioned" annotation in the parser,  
OrderbyClause.isFullParallel flag, and USE_DYNAMI
 C_RANGE annotation. We'll only need USE_RANGE_CONNECTOR annotation (that I 
suggested to rename as USE_STATIC_RANGE) and "nofullparallelsort" hint at the 
query level (need to decide on the name). What do you think?


https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-app/src/main/java/org/apache/asterix/app/resource/RequiredCapacityVisitor.java
File 
asterixdb/asterix-app/src/main/java/org/apache/asterix/app/resource/RequiredCapacityVisitor.java:

Line 310:         visitInternal(op, true);
Forward is a blocking operator, should we call 
calculateMemoryUsageForBlockingOperators() here?


https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/GlobalSamplingAggregateDescriptor.java
File 
asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/GlobalSamplingAggregateDescriptor.java:

Line 173:             ByteArrayInputStream byteStream = new 
ByteArrayInputStream(localSamples.getByteArray(),
> ByteArrayAccessibleInputStream is not available at asterix-runtime. It's in
let's move ByteArrayAccessibleInputStream to org.apache.hyracks.data.std.util


https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/LocalSamplingAggregateDescriptor.java
File 
asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/LocalSamplingAggregateDescriptor.java:

Line 77:         private static final int NUM_SAMPLES = 100;
have we thought about making it configurable (probably in a separate change)?


Line 113:                 System.arraycopy(inputFieldValue.getByteArray(), 
inputFieldValue.getStartOffset(), fieldValue, 0,
can we just produce the output right here at each step (using 
IAsterixListBuilder) instead of copying the data into java List and serializing 
in finish()?


https://asterix-gerrit.ics.uci.edu/#/c/2393/13/hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/logical/ForwardOperator.java
File 
hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/logical/ForwardOperator.java:

Line 45:     private LogicalVariable rangeMapVariable;
minor, just for consistency. An operator assigns variables and consumes 
expressions. So instead of rangeMapVariable field it should be 
Mutable<ILogicalExpression> rangeMapExpr. And instead of passing a variable in 
the constructor we should pass VariableReferenceExpression(rangeMapVar) and 
modify acceptExpressionTransform() accordingly


https://asterix-gerrit.ics.uci.edu/#/c/2393/13/hyracks-fullstack/algebricks/algebricks-rewriter/src/main/java/org/apache/hyracks/algebricks/rewriter/rules/EnforceStructuralPropertiesRule.java
File 
hyracks-fullstack/algebricks/algebricks-rewriter/src/main/java/org/apache/hyracks/algebricks/rewriter/rules/EnforceStructuralPropertiesRule.java:

Line 611:             if 
(parentOp.getAnnotations().containsKey(OperatorAnnotations.USE_RANGE_CONNECTOR))
 {
> I didn't want to change the old code which handles the static range map sin
sure. let's discuss. I think it makes sense to unify them at this point. if 
that implementation is not correct then it can be fixed separately.


Line 666:         // options for range partitioning: 1. range map computed at 
run time, 2. no range map??
> See my other comment related to this in createMergingConnector. Also, I'm n
right, I also doubt that it was ever executed. but if we have a static map at 
this point then we can just pass it to the RangePartitionExchangePOperator


https://asterix-gerrit.ics.uci.edu/#/c/2393/13/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/collectors/SequentialMergeFrameReader.java
File 
hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/collectors/SequentialMergeFrameReader.java:

Line 69:             sender.close();
what if one sender's close() fails. should we continue closing the remaining 
ones?


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/2393
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I73e128029a46f45e6b68c23dfb9310d5de10582f
Gerrit-PatchSet: 13
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Ali Alsuliman <[email protected]>
Gerrit-Reviewer: Ali Alsuliman <[email protected]>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Dmitry Lychagin <[email protected]>
Gerrit-Reviewer: Ildar Absalyamov <[email protected]>
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Reviewer: Taewoo Kim <[email protected]>
Gerrit-Reviewer: Till Westmann <[email protected]>
Gerrit-Reviewer: abdullah alamoudi <[email protected]>
Gerrit-HasComments: Yes

Reply via email to