YenchangChan opened a new issue, #28654:
URL: https://github.com/apache/doris/issues/28654

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### Version
   
   2.0.3
   
   ### What's Wrong?
   
   I'm new to Doris and am using routine load to import data from Kafka into Doris. When I looked at the job's Statistic, I got the following data:
   ```json
   {
        "receivedBytes": 129215158726,
        "runningTxns": [],
        "errorRows": 57,
        "committedTaskNum": 720,
        "loadedRows": 201594590,
        "loadRowsRate": 41912,
        "abortedTaskNum": 0,
        "errorRowsAfterResumed": 0,
        "totalRows": 201594647,
        "unselectedRows": 0,
        "receivedBytesRate": 26864584,
        "taskExecuteTimeMs": 4809870
   }
   ```
   
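   My data was actually fully processed in 27 minutes, but the reported taskExecuteTimeMs is 4809870 ms, which is about 80 minutes. As a result, the computed receivedBytesRate and loadRowsRate are both far below the real values.
   Looking through the code, taskExecuteTimeMs is computed as follows: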
   ```java
   private void updateNumOfData(long numOfTotalRows, long numOfErrorRows, long unselectedRows,
                                long receivedBytes, long taskExecutionTime, boolean isReplay)
           throws UserException {
           this.jobStatistic.totalRows += numOfTotalRows;
           this.jobStatistic.errorRows += numOfErrorRows;
           this.jobStatistic.unselectedRows += unselectedRows;
           this.jobStatistic.receivedBytes += receivedBytes;
           this.jobStatistic.totalTaskExcutionTimeMs += taskExecutionTime;
           
           ...
   }
   ```
   
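   It makes sense that totalRows has to be collected from each BE node and then summed, but the execution time should not be accumulated across BEs. I have 3 BEs, so the time is effectively counted three times. The actual time should therefore be 4809870 / 3 = 1603290 ms, which works out to 26.7 minutes and is close to the real elapsed time.
   In my opinion, this should take the maximum value across the BEs rather than accumulating: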
   ```java
       this.jobStatistic.totalTaskExcutionTimeMs =
               Math.max(this.jobStatistic.totalTaskExcutionTimeMs, taskExecutionTime);
   ```
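   Because the execution time is accumulated here, it is inflated threefold. The direct effect is that the computed load rate is inaccurate: the job actually reaches about 126,000 rows/s, but only about 42,000 rows/s is displayed.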
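
As a rough cross-check of the figures above, the following standalone sketch recomputes the rates from the reported Statistic values. The class `RoutineLoadRateCheck` and the helper `ratePerSecond` are hypothetical names used only for illustration, and the formula (count * 1000 / elapsed milliseconds, with integer division) is an assumption rather than a quote of the Doris source. It also assumes the threefold inflation comes from three tasks running concurrently, one per BE.

```java
public class RoutineLoadRateCheck {

    // Hypothetical helper: units per second from a count and an elapsed time in ms.
    // Integer division is an assumption; it happens to reproduce the reported values.
    public static long ratePerSecond(long count, long elapsedMs) {
        return count * 1000L / elapsedMs;
    }

    public static void main(String[] args) {
        // Values copied from the Statistic output in this issue.
        long loadedRows = 201_594_590L;
        long receivedBytes = 129_215_158_726L;
        long summedTimeMs = 4_809_870L;   // taskExecuteTimeMs as reported

        // Assumption: 3 BEs ran tasks concurrently, so wall-clock time is about 1/3.
        int concurrentTasks = 3;
        long wallClockMs = summedTimeMs / concurrentTasks;   // 1603290 ms

        System.out.println("rows/s with summed time:  "
                + ratePerSecond(loadedRows, summedTimeMs));     // 41912, matches loadRowsRate
        System.out.println("bytes/s with summed time: "
                + ratePerSecond(receivedBytes, summedTimeMs));  // 26864584, matches receivedBytesRate
        System.out.println("rows/s with wall clock:   "
                + ratePerSecond(loadedRows, wallClockMs));      // 125738
    }
}
```

With the summed time, the sketch reproduces the reported loadRowsRate and receivedBytesRate exactly, which suggests the displayed rates are derived from the accumulated task time. Dividing that time by the number of concurrent tasks gives a row rate close to the observed throughput.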
   
   ### What You Expected?
   
   The metrics are calculated correctly.
   
   ### How to Reproduce?
   
   _No response_
   
   ### Anything Else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

