robgutsopedra opened a new issue, #6457:
URL: https://github.com/apache/incubator-devlake/issues/6457

   ## What and why to refactor
   PR review depth and PR size take forever/don't work at all. 
   
   ## Describe the solution you'd like
   I assume they should work as any other panel within Grafana
   
   ## Related issues
   None that I have found
   
   ## Additional context
   I have been focusing my efforst on PR review depth
   
   I was using v0.18.0 query, but also tried query at main branch (which I 
believe are the same). For context, this: 
   ```
   SELECT
     DATE_ADD(date(pr.created_date), INTERVAL 
-$interval(date(pr.created_date))+1 DAY) as time,
     count(distinct prc.id)/count(distinct pr.id) as "PR Review Depth"
   FROM 
     pull_requests pr
     left join pull_request_comments prc on pr.id = prc.pull_request_id
     join project_mapping pm on pr.base_repo_id = pm.row_id and pm.table = 
'repos' 
   WHERE
     $__timeFilter(pr.created_date)
     and pm.project_name in ($project)
     and pr.merged_date is not null
   group by 1
   ```
   
   I also tried a solution proposed by @Startrekzky -- 
   ```
   with t1 as(
      select 
          pull_request_id, count(*) as pr_comment_number
      from
          pull_request_comments
      group by 1
   )
   
   
   SELECT
     DATE_ADD(date(pr.created_date), INTERVAL 
-$interval(date(pr.created_date))+1 DAY) as time,
     sum(t1.pr_comment_number)/count(distinct pr.id) as "PR Review Depth"
   FROM 
     pull_requests pr
     left join t1 on t1.pull_request_id = pr.id
     join project_mapping pm on pr.base_repo_id = pm.row_id and pm.table = 
'repos' 
   WHERE
     $__timeFilter(pr.created_date)
     and pm.project_name in ($project)
     and pr.merged_date is not null
   group by 1
   ```
   
   Which _I think_ it's working better, but still far from perfect. 
   
   In _my_ case, I have around 3600 PRs, 23254 PR comments - 8 * 3600 * 23254 = 
669M. I understand it might be big, but I think devLake handles heavier 
situations than this. Am I the only one getting degraded performance on these 
queries? 
   
   Anyway, I'm far from an SQL expert but I'm more than happy to test/try 
anything you throw my way! 
   
   Thanks a lot!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to