Startrekzky commented on issue #8188: URL: https://github.com/apache/incubator-devlake/issues/8188#issuecomment-2503573757
Hi @kostas-petrakis, I saw your PR https://github.com/apache/incubator-devlake/pull/8206.

One of DevLake's strategies for `PR-deployment_commit` mapping is to skip the mapping of the very first `deployment_commit` in DevLake. Why? Because in many cases:

1. Users collect both `PRs` and `deployment_commits` within a selected timeframe.
2. To find a PR's associated deployment_commit, we need to diff the commits between every pair of consecutive deployment_commits, which means we don't know the `precise commits` deployed by the very first deployment_commit collected by DevLake.
3. Hence, we don't know the PRs deployed by the very first deployment_commit.

In your case, the PR `github:GithubPullRequest:4:1806504909` may well have been deployed by the first deployment_commit `github:GithubRun:4:435823930:8567339272:https:\/\/github.com\/l*****\/****`. However, suppose we allowed the first deployment_commit to be mapped to PRs. If the `deploy time` gap between the prior deployment and the first deployment collected by DevLake is large, many irrelevant PRs merged during that period would be mis-mapped to that deployment commit (some other users have encountered this problem).

In the end, we adopted the above strategy of excluding the very first deployment when mapping to PRs.
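
To make the strategy above concrete, here is a minimal sketch in Python. It is not DevLake's actual implementation: the function name `map_prs_to_deployments`, the input shapes, and the assumption of a strictly linear commit history are all hypothetical and chosen only for illustration. It diffs each pair of consecutive deployments and leaves the first collected deployment unmapped, because there is no earlier deployment to diff against.

```python
from typing import Dict, List, Optional


def map_prs_to_deployments(
    history: List[str],                # commit SHAs, oldest first (linear history assumed)
    deployments: List[str],            # deployed commit SHAs, oldest first
    pr_merge_commits: Dict[str, str],  # hypothetical PR id -> merge commit SHA
) -> Dict[str, Optional[str]]:
    """Map each PR to the first deployment whose diff contains its merge commit."""
    position = {sha: i for i, sha in enumerate(history)}
    mapping: Dict[str, Optional[str]] = {pr: None for pr in pr_merge_commits}

    # Pair each deployment with its predecessor. The very first deployment
    # never appears as `curr`, so nothing is ever mapped to it.
    for prev, curr in zip(deployments, deployments[1:]):
        lo, hi = position[prev], position[curr]
        # Commits deployed by `curr` but not already deployed by `prev`.
        new_commits = set(history[lo + 1 : hi + 1])
        for pr, sha in pr_merge_commits.items():
            if mapping[pr] is None and sha in new_commits:
                mapping[pr] = curr
    return mapping


history = ["c1", "c2", "c3", "c4", "c5"]
deployments = ["c2", "c4", "c5"]  # "c2" is the first deployment collected
prs = {"pr-a": "c1", "pr-b": "c3", "pr-c": "c5"}

result = map_prs_to_deployments(history, deployments, prs)
print(result)
```

In this toy data, `pr-a` was merged before the first collected deployment `c2`, so it stays unmapped. That is the trade-off described above: a PR like this may really have shipped with the first deployment, but mapping it would also pull in every unrelated PR merged since the unknown prior deployment.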