Ali Alsuliman has posted comments on this change. Change subject: [ASTERIXDB-2286][COMP][FUN][HYR] Parallel Sort Optimization ......................................................................
Patch Set 13: (16 comments) https://asterix-gerrit.ics.uci.edu/#/c/2393/13/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/GlobalSamplingAggregateDescriptor.java File asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/GlobalSamplingAggregateDescriptor.java: Line 173: ByteArrayInputStream byteStream = new ByteArrayInputStream(localSamples.getByteArray(), > don't create new instance for each tuple. use ByteArrayAccessibleInputStrea ByteArrayAccessibleInputStream is not available at asterix-runtime. It's in asterix-external-data. Any idea how to go about this? Line 175: AOrderedList localSamplesList = listSerde.deserialize(new DataInputStream(byteStream)); > don't create DateInputStream instance for each tuple. create it once with B ByteArrayAccessibleInputStream is not available at asterix-runtime. It's in asterix-external-data. Any idea how to go about this? https://asterix-gerrit.ics.uci.edu/#/c/2393/13/hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/base/OperatorAnnotations.java File hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/base/OperatorAnnotations.java: Line 25: public static final String USE_RANGE_CONNECTOR = "USE_RANGE_CONNECTOR"; // --> > should we rename USE_RANGE_CONECTOR to USE_STATIC_RANGE and rename the valu Done https://asterix-gerrit.ics.uci.edu/#/c/2393/13/hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/AbstractStableSortPOperator.java File hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/AbstractStableSortPOperator.java: Line 82: if (sortOp.getAnnotations().get(OperatorAnnotations.USE_DYNAMIC_RANGE) == Boolean.TRUE) { > should the same be required for USE_RANGE_CONNECTOR annotation? We know the Check my other comment related to this. Probably we should talk about this. https://asterix-gerrit.ics.uci.edu/#/c/2393/13/hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/utils/DotFormatGenerator.java File hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/utils/DotFormatGenerator.java: Line 52: private static final LogicalOperatorDotVisitor dotVisitor = new LogicalOperatorDotVisitor(); > why was it made static? this could be a problem if we want to call it concu Well, it's not meant for concurrent use. But I made it non-static now just in case. https://asterix-gerrit.ics.uci.edu/#/c/2393/13/hyracks-fullstack/algebricks/algebricks-rewriter/src/main/java/org/apache/hyracks/algebricks/rewriter/rules/EnforceStructuralPropertiesRule.java File hyracks-fullstack/algebricks/algebricks-rewriter/src/main/java/org/apache/hyracks/algebricks/rewriter/rules/EnforceStructuralPropertiesRule.java: Line 117: private static final FunctionIdentifier LOCAL_SAMPLING_FUN = AlgebricksBuiltinFunctions.LOCAL_SAMPLING; > just a thought. you can pass these two function identifiers through the con Done Line 144: debug(">>>> Optimizing operator " + op.getPhysicalOperator() + ".\n"); > the idea behind calling Logger.isDebugEnabled() is to avoid string concaten Sure. Done. Line 165: trace(STR_PROP_FOR + op.getPhysicalOperator() + ": " + op.getDeliveredPhysicalProperties() + "\n"); > same issue with potentially unnecessary message construction. revert to usi Done Line 170: private boolean physOptimizeOp(Mutable<ILogicalOperator> opRef, IPhysicalPropertiesVector required, > minor. can you put this method back after getStartChildIndex()? This would Done. I reverted all clean-up/renaming/restructuring code to make it even easier for reviewing. This rule is really messy. Line 611: if (parentOp.getAnnotations().containsKey(OperatorAnnotations.USE_RANGE_CONNECTOR)) { > I think that we once we start requiring OrderedPartitionedProperty for chil I didn't want to change the old code which handles the static range map since I have some concerns about its implementation. In fact, I'm not sure if the implementation of static range map is correct. Take a look at RangePartitionMergeExchangePOperator. I have no history of that change, so I don't know why it was done this way. My thought was to leave discussing/changing it for another change since the parallel sort change keeps growing. If you prefer, I can include it in the parallel sort change, but we have to discuss some of the issues I found. Line 666: // options for range partitioning: 1. range map computed at run time, 2. no range map?? > I think the two options are: 1) range map is known statically (USE_RANGE_C See my other comment related to this in createMergingConnector. Also, I'm not sure if line 671 is ever executed (this is old code): return new RangePartitionExchangePOperator(partitioningColumns, domain, null); Line 728: ReplicateOperator replicateOperator = new ReplicateOperator(2); > set source location (take if from the parent operator) Done. Please, check if correct or I missed anything. Line 750: private void createAggregateFunction(IOptimizationContext context, List<LogicalVariable> localResultVariables, > set source locations for all created expressions and operators Done Line 796: AggregateOperator aggregateOperator = new AggregateOperator(resultVariables, expressions); > set source location Done Line 810: private static ExchangeOperator createOneToOneExchangeOp(MutableObject<ILogicalOperator> inputOperator, > IsolateHyracksOperatorsRule.insertOneToOneExchange() has almost identical c Not done yet. The call to "computeDeliveredPhysicalProperties" in IsolateHyracksOperatorsRule would crash here in this rule at this point in time. Will see if there is a neat way to get this in the next round of your comments/feedback. Line 824: ForwardOperator forwardOperator = new ForwardOperator(rangeMapKey, rangeMapVariable); > set source location Done -- To view, visit https://asterix-gerrit.ics.uci.edu/2393 To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-MessageType: comment Gerrit-Change-Id: I73e128029a46f45e6b68c23dfb9310d5de10582f Gerrit-PatchSet: 13 Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Owner: Ali Alsuliman <[email protected]> Gerrit-Reviewer: Ali Alsuliman <[email protected]> Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Dmitry Lychagin <[email protected]> Gerrit-Reviewer: Ildar Absalyamov <[email protected]> Gerrit-Reviewer: Jenkins <[email protected]> Gerrit-Reviewer: Taewoo Kim <[email protected]> Gerrit-Reviewer: Till Westmann <[email protected]> Gerrit-Reviewer: abdullah alamoudi <[email protected]> Gerrit-HasComments: Yes
