morrySnow commented on a change in pull request #8695:
URL: https://github.com/apache/incubator-doris/pull/8695#discussion_r838438250
##########
File path:
fe/fe-core/src/main/java/org/apache/doris/planner/JoinCostEvaluation.java
##########
@@ -147,7 +149,7 @@ public long constructHashTableSpace() {
             Math.pow(1.5, (int) ((Math.log((double) rhsTreeCardinality/4096) / Math.log(1.5)) + 1)) * 4096;
         double nodeOverheadSpace = nodeArrayLen * 16;
         double nodeTuplePointerSpace = nodeArrayLen * rhsTreeTupleIdNum * 8;
-        return Math.round((bucketPointerSpace + (double) rhsTreeCardinality * rhsTreeAvgRowSize
+        return Math.round((bucketPointerSpace + (double) rhsTreeCardinality * rhsTreeAvgRowSize * COMPRESSION_RATIO
Review comment:
> I checked how the current code computes the average row size; it falls into the following cases:
>
> 1. OlapScanNode: avgRowSize = totalBytes (compressed) / cardinality
> 2. Agg or Sort: avgRowSize = the sum of the actual sizes of each column (I am not sure whether other values are possible)
>
> So
>
> 1. Is it possible to plug the actual row size directly into the current formula so it matches the actual consumption? If yes, then I'm in favor of changing the row size. If not, are there other problems with the current formula?
> 2. Some operators do not actually use the compressed size, so it would be best to adjust the average row size in PlanNode.computeStats().
I agree with you. This is the same suggestion that @[xinyiZzz](https://github.com/xinyiZzz) made.
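
For context, the estimate under discussion can be sketched as a standalone method roughly like the one below. This is a minimal sketch, not the exact Doris code: the `COMPRESSION_RATIO` value of 3.0, the `bucketPointerSpace = nodeArrayLen * 8` definition, and the 4096 lower bound on the node array length are assumptions for illustration (only the growth formula and the per-node overheads appear in the diff above).

```java
// Hedged sketch of the hash table memory estimate from the diff above.
// COMPRESSION_RATIO, bucketPointerSpace, and the small-cardinality branch
// are illustrative assumptions, not the exact JoinCostEvaluation code.
public class HashTableSpaceSketch {
    // Assumed: OlapScanNode's avgRowSize is derived from compressed bytes,
    // so the in-memory row data is scaled up by an assumed ratio.
    static final double COMPRESSION_RATIO = 3.0;

    static long constructHashTableSpace(long rhsTreeCardinality,
                                        double rhsTreeAvgRowSize,
                                        int rhsTreeTupleIdNum) {
        // Node array length: 4096 minimum, then rounded up along powers of 1.5
        // (the growth formula shown in the diff).
        double nodeArrayLen = rhsTreeCardinality <= 4096
                ? 4096
                : Math.pow(1.5, (int) ((Math.log((double) rhsTreeCardinality / 4096)
                        / Math.log(1.5)) + 1)) * 4096;
        double bucketPointerSpace = nodeArrayLen * 8;   // assumed: one 8-byte pointer per bucket
        double nodeOverheadSpace = nodeArrayLen * 16;   // per-node bookkeeping (from the diff)
        double nodeTuplePointerSpace = nodeArrayLen * rhsTreeTupleIdNum * 8;
        // Row data scaled by COMPRESSION_RATIO, per the change under review.
        return Math.round(bucketPointerSpace
                + (double) rhsTreeCardinality * rhsTreeAvgRowSize * COMPRESSION_RATIO
                + nodeOverheadSpace + nodeTuplePointerSpace);
    }

    public static void main(String[] args) {
        // 10k rows of ~64 bytes (compressed average), one tuple id.
        System.out.println(constructHashTableSpace(10_000, 64.0, 1));
    }
}
```

As the review notes, scaling by a fixed ratio is only correct when avgRowSize really reflects compressed bytes, which is why normalizing the row size in PlanNode.computeStats() is the cleaner fix.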
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]