[ https://issues.apache.org/jira/browse/DRILL-6115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362860#comment-16362860 ]

ASF GitHub Bot commented on DRILL-6115:
---------------------------------------

Github user vrozov commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1110#discussion_r167956002
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/visitor/InsertLocalExchangeVisitor.java ---
    @@ -20,133 +20,41 @@
     import com.google.common.collect.Lists;
     
     import org.apache.drill.exec.planner.physical.ExchangePrel;
    -import org.apache.drill.exec.planner.physical.HashPrelUtil;
    -import org.apache.drill.exec.planner.physical.HashPrelUtil.HashExpressionCreatorHelper;
    -import org.apache.drill.exec.planner.physical.HashToRandomExchangePrel;
     import org.apache.drill.exec.planner.physical.PlannerSettings;
     import org.apache.drill.exec.planner.physical.Prel;
    -import org.apache.drill.exec.planner.physical.ProjectPrel;
    -import org.apache.drill.exec.planner.physical.DrillDistributionTrait.DistributionField;
    -import org.apache.drill.exec.planner.physical.UnorderedDeMuxExchangePrel;
    -import org.apache.drill.exec.planner.physical.UnorderedMuxExchangePrel;
    -import org.apache.drill.exec.planner.sql.DrillSqlOperator;
     import org.apache.drill.exec.server.options.OptionManager;
     import org.apache.calcite.rel.RelNode;
    -import org.apache.calcite.rel.type.RelDataType;
    -import org.apache.calcite.rel.type.RelDataTypeField;
    -import org.apache.calcite.rex.RexBuilder;
    -import org.apache.calcite.rex.RexNode;
    -import org.apache.calcite.rex.RexUtil;
    -
    -import java.math.BigDecimal;
    -import java.util.Collections;
     import java.util.List;
     
     public class InsertLocalExchangeVisitor extends BasePrelVisitor<Prel, Void, RuntimeException> {
    -  private final boolean isMuxEnabled;
    -  private final boolean isDeMuxEnabled;
    -
    -
    -  public static class RexNodeBasedHashExpressionCreatorHelper implements HashExpressionCreatorHelper<RexNode> {
    -    private final RexBuilder rexBuilder;
    +  private final OptionManager options;
     
    -    public RexNodeBasedHashExpressionCreatorHelper(RexBuilder rexBuilder) {
    -      this.rexBuilder = rexBuilder;
    -    }
    -
    -    @Override
    -    public RexNode createCall(String funcName, List<RexNode> inputFields) {
    -      final DrillSqlOperator op =
    -          new DrillSqlOperator(funcName, inputFields.size(), true, false);
    -      return rexBuilder.makeCall(op, inputFields);
    +  private static boolean isMuxEnabled(OptionManager options) {
    +    if (options.getOption(PlannerSettings.MUX_EXCHANGE.getOptionName()).bool_val ||
    --- End diff --
    
    Use `return` with the boolean expression directly instead of wrapping it in an `if`.
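
    The suggestion can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical: the `Map`-based option lookup stands in for Drill's `OptionManager`, and the option names and the second operand of `||` are placeholders, since the diff above is cut off at that point.

```java
import java.util.Map;

public class ReturnInsteadOfIf {
  // Verbose form: branches on a boolean expression only to return true or false.
  static boolean isEnabledWithIf(Map<String, Boolean> options, String a, String b) {
    if (options.getOrDefault(a, false) || options.getOrDefault(b, false)) {
      return true;
    }
    return false;
  }

  // Equivalent form: return the boolean expression directly.
  static boolean isEnabledWithReturn(Map<String, Boolean> options, String a, String b) {
    return options.getOrDefault(a, false) || options.getOrDefault(b, false);
  }

  public static void main(String[] args) {
    // Placeholder option names, not real Drill planner settings.
    Map<String, Boolean> options = Map.of("option.a", false, "option.b", true);
    System.out.println(isEnabledWithIf(options, "option.a", "option.b"));
    System.out.println(isEnabledWithReturn(options, "option.a", "option.b"));
  }
}
```

    Both methods return the same value for every input; the second simply drops the redundant branch.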


> SingleMergeExchange is not scaling up when many minor fragments are allocated 
> for a query.
> ------------------------------------------------------------------------------------------
>
>                 Key: DRILL-6115
>                 URL: https://issues.apache.org/jira/browse/DRILL-6115
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Relational Operators
>    Affects Versions: 1.12.0
>            Reporter: Hanumath Rao Maduri
>            Assignee: Hanumath Rao Maduri
>            Priority: Major
>             Fix For: 1.13.0
>
>         Attachments: Enhancing Drill to multiplex ordered merge exchanges.docx
>
>
> SingleMergeExchange is created when a global order is required in the output. 
> The following query produces the SingleMergeExchange.
> {code:java}
> 0: jdbc:drill:zk=local> explain plan for select L_LINENUMBER from 
> dfs.`/drill/tables/lineitem` order by L_LINENUMBER;
> +------+------+
> | text | json |
> +------+------+
> | 00-00 Screen
> 00-01 Project(L_LINENUMBER=[$0])
> 00-02 SingleMergeExchange(sort0=[0])
> 01-01 SelectionVectorRemover
> 01-02 Sort(sort0=[$0], dir0=[ASC])
> 01-03 HashToRandomExchange(dist0=[[$0]])
> 02-01 Scan(table=[[dfs, /drill/tables/lineitem]], 
> groupscan=[JsonTableGroupScan [ScanSpec=JsonScanSpec 
> [tableName=maprfs:///drill/tables/lineitem, condition=null], 
> columns=[`L_LINENUMBER`], maxwidth=15]])
> {code}
> On a 10-node cluster, if the table is huge, Drill can spawn many minor 
> fragments, all of which are merged on a single node by one merge receiver. 
> This creates a lot of memory pressure on the receiver node and also an 
> execution bottleneck. To address this issue, the merge receiver should be a 
> multiphase merge receiver. 
> Ideally, for a large cluster, one could introduce tree merges so that merging 
> can be done in parallel. As a first step, though, I think it is better to use 
> the existing infrastructure for multiplexing operators to generate an 
> OrderedMux, so that all the minor fragments belonging to one drillbit are 
> merged locally and the merged data is sent across to the receiver operator.
> For example, consider a 10-node cluster in which each node processes 14 
> minor fragments.
> The current version of the code merges all 140 minor fragments at one receiver.
> The proposed version uses a two-level merge: first a 14-way merge in each 
> drillbit, with the drillbits working in parallel, 
> and then a merge of the 10 resulting streams at the receiver node.
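
The fan-in arithmetic in the issue description can be sketched with a small, self-contained example. This is not Drill code: the class and method names are hypothetical, and in-memory sorted lists stand in for the sorted streams produced by minor fragments. It only shows that merging per node first and then merging the per-node results yields the same global order as one flat merge, while reducing the receiver's fan-in from 140 to 10.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

public class TwoLevelMerge {
  // K-way merge of already-sorted lists using a min-heap of cursors.
  static List<Integer> merge(List<List<Integer>> sortedInputs) {
    // Each heap entry is {value, inputIndex, positionInInput}.
    PriorityQueue<int[]> heap =
        new PriorityQueue<>((x, y) -> Integer.compare(x[0], y[0]));
    for (int i = 0; i < sortedInputs.size(); i++) {
      if (!sortedInputs.get(i).isEmpty()) {
        heap.add(new int[] {sortedInputs.get(i).get(0), i, 0});
      }
    }
    List<Integer> out = new ArrayList<>();
    while (!heap.isEmpty()) {
      int[] top = heap.poll();
      out.add(top[0]);
      List<Integer> src = sortedInputs.get(top[1]);
      int next = top[2] + 1;
      if (next < src.size()) {
        heap.add(new int[] {src.get(next), top[1], next});
      }
    }
    return out;
  }

  // Two-level merge: merge each node's fragments locally, then merge node outputs.
  static List<Integer> twoLevelMerge(List<List<List<Integer>>> fragmentsByNode) {
    List<List<Integer>> perNode = new ArrayList<>();
    for (List<List<Integer>> nodeFragments : fragmentsByNode) {
      perNode.add(merge(nodeFragments)); // fan-in = fragments on this node
    }
    return merge(perNode); // fan-in = number of nodes
  }

  public static void main(String[] args) {
    int nodes = 10;
    int fragmentsPerNode = 14;
    List<List<List<Integer>>> cluster = new ArrayList<>();
    List<List<Integer>> flat = new ArrayList<>();
    for (int n = 0; n < nodes; n++) {
      List<List<Integer>> nodeFragments = new ArrayList<>();
      for (int f = 0; f < fragmentsPerNode; f++) {
        int id = n * fragmentsPerNode + f;
        // Each fragment holds a small sorted run of made-up values.
        List<Integer> run = List.of(id, id + 140, id + 280);
        nodeFragments.add(run);
        flat.add(run);
      }
      cluster.add(nodeFragments);
    }
    System.out.println("single-level fan-in at receiver: " + flat.size());
    System.out.println("two-level fan-in at receiver: " + cluster.size());
    System.out.println("same result: " + merge(flat).equals(twoLevelMerge(cluster)));
  }
}
```

The first-level merges are independent of each other, which is what allows them to run in parallel on separate drillbits; the sketch runs them sequentially only to stay short.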



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
