morrySnow commented on a change in pull request #8695:
URL: https://github.com/apache/incubator-doris/pull/8695#discussion_r838438250
##########
File path:
fe/fe-core/src/main/java/org/apache/doris/planner/JoinCostEvaluation.java
##########
@@ -147,7 +149,7 @@ public long constructHashTableSpace() {
             Math.pow(1.5, (int) ((Math.log((double) rhsTreeCardinality/4096) / Math.log(1.5)) + 1)) * 4096;
         double nodeOverheadSpace = nodeArrayLen * 16;
         double nodeTuplePointerSpace = nodeArrayLen * rhsTreeTupleIdNum * 8;
-        return Math.round((bucketPointerSpace + (double) rhsTreeCardinality * rhsTreeAvgRowSize
+        return Math.round((bucketPointerSpace + (double) rhsTreeCardinality * rhsTreeAvgRowSize * COMPRESSION_RATIO
Review comment:
> I checked how the current code computes the average row size; it falls into the following cases:
>
> 1. OlapScanNode: avgRowSize = totalBytes (compressed) / cardinality
> 2. Agg or Sort: avgRowSize = the sum of the actual sizes of each column (I am not sure whether other values are possible)
>
> So
>
> 1. Is it possible to plug the actual row size directly into the current formula so it matches the actual consumption? If yes, then I'm in favor of changing the row size. If not, are there other problems with the current formula?
> 2. Some operators do not actually use the compressed size, so it would be best to adjust the average row size in PlanNode.computeStats().
I agree with you. This is the same suggestion that @[xinyiZzz](https://github.com/xinyiZzz) made.
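
For context, the estimate under discussion can be sketched as a standalone method roughly like the one below. This is a minimal sketch, not the exact Doris code: the `COMPRESSION_RATIO` value of 3.0, the `bucketPointerSpace = nodeArrayLen * 8` definition, and the 4096 lower bound on the node array length are assumptions for illustration (only the growth formula and the per-node overheads appear in the diff above).

```java
// Hedged sketch of the hash table memory estimate from the diff above.
// COMPRESSION_RATIO, bucketPointerSpace, and the small-cardinality branch
// are illustrative assumptions, not the exact JoinCostEvaluation code.
public class HashTableSpaceSketch {
    // Assumed: OlapScanNode's avgRowSize is derived from compressed bytes,
    // so the in-memory row data is scaled up by an assumed ratio.
    static final double COMPRESSION_RATIO = 3.0;

    static long constructHashTableSpace(long rhsTreeCardinality,
                                        double rhsTreeAvgRowSize,
                                        int rhsTreeTupleIdNum) {
        // Node array length: 4096 minimum, then rounded up along powers of 1.5
        // (the growth formula shown in the diff).
        double nodeArrayLen = rhsTreeCardinality <= 4096
                ? 4096
                : Math.pow(1.5, (int) ((Math.log((double) rhsTreeCardinality / 4096)
                        / Math.log(1.5)) + 1)) * 4096;
        double bucketPointerSpace = nodeArrayLen * 8;   // assumed: one 8-byte pointer per bucket
        double nodeOverheadSpace = nodeArrayLen * 16;   // per-node bookkeeping (from the diff)
        double nodeTuplePointerSpace = nodeArrayLen * rhsTreeTupleIdNum * 8;
        // Row data scaled by COMPRESSION_RATIO, per the change under review.
        return Math.round(bucketPointerSpace
                + (double) rhsTreeCardinality * rhsTreeAvgRowSize * COMPRESSION_RATIO
                + nodeOverheadSpace + nodeTuplePointerSpace);
    }

    public static void main(String[] args) {
        // 10k rows of ~64 bytes (compressed average), one tuple id.
        System.out.println(constructHashTableSpace(10_000, 64.0, 1));
    }
}
```

As the review notes, scaling by a fixed ratio is only correct when avgRowSize really reflects compressed bytes, which is why normalizing the row size in PlanNode.computeStats() is the cleaner fix.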
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]