[jira] [Commented] (DRILL-6381) Add capability to do index based planning and execution

ASF GitHub Bot (JIRA) Thu, 25 Oct 2018 15:36:55 -0700


    [ 
https://issues.apache.org/jira/browse/DRILL-6381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664362#comment-16664362
 ]


ASF GitHub Bot commented on DRILL-6381:
---------------------------------------

amansinha100 commented on a change in pull request #1466: DRILL-6381: Add 
support for index based planning and execution
URL: https://github.com/apache/drill/pull/1466#discussion_r228357573
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/BroadcastExchangePrel.java
 ##########
 @@ -57,7 +57,9 @@ public RelOptCost computeSelfCost(RelOptPlanner planner, RelMetadataQuery mq) {
 
     final int  rowWidth = child.getRowType().getFieldCount() * DrillCostBase.AVG_FIELD_WIDTH;
     final double cpuCost = broadcastFactor * DrillCostBase.SVR_CPU_COST * inputRows;
-    final double networkCost = broadcastFactor * DrillCostBase.BYTE_NETWORK_COST * inputRows * rowWidth * numEndPoints;
+
+    //we assume localhost network cost is 1/10 of regular network cost
+    final double networkCost = broadcastFactor * DrillCostBase.BYTE_NETWORK_COST * inputRows * rowWidth * (numEndPoints - 0.9);
 
 Review comment:
   @gparai, I forgot to respond to this. The cost formula is:
       (cost of broadcasting num_bytes to the N - 1 remote nodes)
         + (cost of local broadcast to all minor fragments on my own node)
       = (C * num_bytes * (N - 1)) + (C * num_bytes * 0.1)
       = C * num_bytes * (N - 0.9)
   where the 0.1 factor comes from the assumption that a local broadcast
   incurs 10% of the network cost of a remote broadcast. In terms of the
   code above, N is numEndPoints, num_bytes is inputRows * rowWidth, and C
   is broadcastFactor * DrillCostBase.BYTE_NETWORK_COST.
   
   While the formula seems reasonable, it biases the cost in favor of 
Broadcast over HashPartition. We should revisit this, and ideally a similar 
change should be made for HashPartition as well. I don't recall the exact use 
case that motivated the change; it may have been Index Intersection. 
I can create a JIRA to revisit this. 
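
   As a rough illustration of the arithmetic above, here is a minimal, 
standalone sketch. It is not the Drill code: the class name, method names, 
and the per-byte constant are hypothetical placeholders, and the only things 
taken from this thread are the (N - 1) remote term, the 0.1 local factor, 
and the previous N-based formula.

```java
// Hypothetical sketch of the broadcast network-cost arithmetic discussed above.
// None of these names exist in Drill; they are illustrative only.
class BroadcastCostSketch {

    // Placeholder per-byte network cost "C". This is NOT the real value of
    // DrillCostBase.BYTE_NETWORK_COST; it is chosen so the numbers are easy to read.
    static final double C = 1.0;

    // Assumption stated in the review comment: a local broadcast costs
    // 10% of what a remote broadcast costs.
    static final double LOCAL_COST_FRACTION = 0.1;

    // Cost of sending numBytes to each of the (N - 1) remote nodes.
    static double remoteCost(double numBytes, int numEndPoints) {
        return C * numBytes * (numEndPoints - 1);
    }

    // Cost of the broadcast to fragments on the sender's own node.
    static double localCost(double numBytes) {
        return C * numBytes * LOCAL_COST_FRACTION;
    }

    // New formula: remote + local, algebraically C * numBytes * (N - 0.9).
    static double networkCost(double numBytes, int numEndPoints) {
        return remoteCost(numBytes, numEndPoints) + localCost(numBytes);
    }

    // Previous formula from the removed diff line: C * numBytes * N.
    static double previousNetworkCost(double numBytes, int numEndPoints) {
        return C * numBytes * numEndPoints;
    }

    public static void main(String[] args) {
        double numBytes = 1000.0; // stands in for inputRows * rowWidth
        int n = 4;                // stands in for numEndPoints
        System.out.println("previous: " + previousNetworkCost(numBytes, n));
        System.out.println("new:      " + networkCost(numBytes, n));
    }
}
```

   With these placeholder values (N = 4, num_bytes = 1000, C = 1), the new 
formula gives roughly 3100 against 4000 for the previous one. The absolute 
reduction is always 0.9 * C * num_bytes, so the relative discount shrinks as 
N grows. Since no matching adjustment exists for HashPartition, this 
discount is the source of the bias mentioned above.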



> Add capability to do index based planning and execution
> -------------------------------------------------------
>
>                 Key: DRILL-6381
>                 URL: https://issues.apache.org/jira/browse/DRILL-6381
>             Project: Apache Drill
>          Issue Type: New Feature
>          Components: Execution - Relational Operators, Query Planning & Optimization
>            Reporter: Aman Sinha
>            Assignee: Aman Sinha
>            Priority: Major
>             Fix For: 1.15.0
>
>
> If the underlying data source supports indexes (primary and secondary 
> indexes), Drill should leverage them during planning and execution in order 
> to improve query performance.  
> On the planning side, the Drill planner should be enhanced to provide an 
> abstraction layer that expresses the index metadata and statistics.  Further, 
> cost-based index selection is needed to decide which index(es) are 
> suitable.  
> On the execution side, appropriate operator enhancements would be needed to 
> handle different categories of indexes, such as covering and non-covering 
> indexes, taking into consideration that the index data may not be co-located 
> with the primary table, i.e., a global index.


