This is an automated email from the ASF dual-hosted git repository.
abeizn pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-devlake-website.git
The following commit(s) were added to refs/heads/main by this push:
new 9236a0ecc docs: update pull_requests schema
9236a0ecc is described below
commit 9236a0ecc52a5a0e40403830e37da3847a0218a4
Author: Startrekzky <[email protected]>
AuthorDate: Fri Oct 14 20:31:11 2022 +0800
docs: update pull_requests schema
---
docs/DataModels/DevLakeDomainLayerSchema.md | 41 ++++++++++++++++-------------
docs/Metrics/LeadTimeForChanges.md | 3 +--
2 files changed, 24 insertions(+), 20 deletions(-)
diff --git a/docs/DataModels/DevLakeDomainLayerSchema.md
b/docs/DataModels/DevLakeDomainLayerSchema.md
index 23fd2483c..e202c99a4 100644
--- a/docs/DataModels/DevLakeDomainLayerSchema.md
+++ b/docs/DataModels/DevLakeDomainLayerSchema.md
@@ -42,7 +42,7 @@ Tables that end with WIP are still under development.
1. The name of a table is in plural form. Eg. boards, issues, etc.
2. The name of a table which describe the relation between 2 entities is in
the form of [BigEntity in singular form]\_[SmallEntity in plural form]. Eg.
board_issues, sprint_issues, pull_request_comments, etc.
-3. Value of the field in enum type are in capital letters. Eg.
[table.issues.type](https://merico.feishu.cn/docs/doccnvyuG9YpVc6lvmWkmmbZtUc#ZDCw9k)
has 3 values, REQUIREMENT, BUG, INCIDENT. Values that are phrases, such as
'IN_PROGRESS' of
[table.issues.status](https://merico.feishu.cn/docs/doccnvyuG9YpVc6lvmWkmmbZtUc#ZDCw9k),
are separated with underscore '\_'.
+3. Value of the field in enum type are in capital letters. Eg.
[table.issues.type](#issues) has 3 values, REQUIREMENT, BUG, INCIDENT. Values
that are phrases, such as 'IN_PROGRESS' of [table.issues.status](#issues), are
separated with underscore '\_'.
<br/>
@@ -72,7 +72,7 @@ An `issue` is the abstraction of Jira/Github/GitLab/TAPD/...
issues.
| **field** | **type** | **length** | **description**
[...]
| :-------------------------- | :------- | :---------
|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[...]
| `id` | varchar | 255 | An issue's `id` is
composed of < plugin >:< Entity >:< PK0 >[:PK1]..." <ul><li>For Github issues,
a Github issue's id is like "github:GithubIssues:< GithubIssueId >". Eg.
'github:GithubIssues:1049355647'</li> <li>For Jira issues, a Github repo's id
is like "jira:JiraIssues:< JiraSourceId >:< JiraIssueId >". Eg.
'jira:JiraIssues:1:10063'. < JiraSourceId > is used to identify which jira
source the issue came from, since DevLake users [...]
-| `issue_key` | varchar | 255 | The key of this issue.
For example, the key of this Github
[issue](https://github.com/merico-dev/lake/issues/1145) is 1145.
[...]
+| `issue_key` | varchar | 255 | The key of this issue.
For example, the key of this Github
[issue](https://github.com/apache/incubator-devlake/issues/1145) is 1145.
[...]
| `url` | varchar | 255 | The url of the issue.
It's a web address in most cases.
[...]
| `title` | varchar | 255 | The title of an issue
[...]
| `description` | longtext | | The detailed
description/summary of an issue
[...]
@@ -157,14 +157,14 @@ This table shows the work logged under issues. Usually,
an issue has multiple wo
#### boards
-A `board` is an issue list or a collection of issues. It's the abstraction of
a Jira board, a Jira project, a [GitHub issue
list](https://github.com/merico-dev/lake/issues) or a GitLab issue list. This
table can be used to filter issues by the boards they belong to.
+A `board` is an issue list or a collection of issues. It's the abstraction of
a Jira board, a Jira project, a [GitHub issue
list](https://github.com/apache/incubator-devlake/issues) or a GitLab issue
list. This table can be used to filter issues by the boards they belong to.
| **field** | **type** | **length** | **description**
| **key** |
| :------------- | :------- | :--------- |
:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| :------ |
| `id` | varchar | 255 | A board's `id` is composed of "<
plugin >:< Entity >:< PK0 >[:PK1]..." <ul><li>For a Github repo's issue list,
the board id is like "< github >:< GithubRepos >:< GithubRepoId >". Eg.
"github:GithubRepo:384111310"</li> <li>For a Jira Board, the id is like the
board id is like "< jira >:< JiraSourceId >< JiraBoards >:< JiraBoardsId >".
Eg. "jira:1:JiraBoards:12"</li></ul> | PK |
-| `name` | varchar | 255 | The name of the board. Note: the
board name of a Github project 'merico-dev/lake' is 'merico-dev/lake',
representing the [default issue
list](https://github.com/merico-dev/lake/issues).
|
|
+| `name` | varchar | 255 | The name of the board. Note: the
board name of a Github project 'apache/incubator-devlake' is
'apache/incubator-devlake', representing the [default issue
list](https://github.com/apache/incubator-devlake/issues).
| |
| `description` | varchar | 255 | The description of the board.
| |
-| `url` | varchar | 255 | The url of the board. Eg.
https://Github.com/merico-dev/lake
| |
+| `url` | varchar | 255 | The url of the board. Eg.
https://github.com/apache/incubator-devlake
| |
| `created_date` | datetime | 3 | Board creation time
| |
| `type` | varchar | 255 | Identify scrum and non-scrum board
|
|
@@ -183,14 +183,14 @@ A `sprint` is the abstraction of Jira sprints, TAPD
iterations and GitHub milest
| **field** | **type** | **length** | **description**
[...]
| :------------------ | :------- | :---------
|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[...]
-| `id` | varchar | 255 | A sprint's `id` is composed of
"< plugin >:< Entity >:< PK0 >[:PK1]..."<ul><li>A sprint in a Github repo is a
milestone, the sprint id is like "< github >:< GithubRepos >:< GithubRepoId >:<
milestoneNumber >".<br/>Eg. The id for this
[sprint](https://github.com/merico-dev/lake/milestone/5) is
"github:GithubRepo:384111310:5"</li><li>For a Jira Board, the id is like "<
jira >:< JiraSourceId >< JiraBoards >:< JiraBoardsId >".<br/>Eg. "jira:1:J [...]
-| `name` | varchar | 255 | The name of sprint.<br/>For
Github projects, the sprint name is the milestone name. For instance, 'v0.10.0
- Introduce Temporal to DevLake' is the name of this
[sprint](https://github.com/merico-dev/lake/milestone/5).
[...]
+| `id` | varchar | 255 | A sprint's `id` is composed of
"< plugin >:< Entity >:< PK0 >[:PK1]..."<ul><li>A sprint in a Github repo is a
milestone, the sprint id is like "< github >:< GithubRepos >:< GithubRepoId >:<
milestoneNumber >".<br/>Eg. The id for this
[sprint](https://github.com/apache/incubator-devlake/milestone/5) is
"github:GithubRepo:384111310:5"</li><li>For a Jira Board, the id is like "<
jira >:< JiraSourceId >< JiraBoards >:< JiraBoardsId >".<br/>Eg. [...]
+| `name` | varchar | 255 | The name of sprint.<br/>For
Github projects, the sprint name is the milestone name. For instance, 'v0.10.0
- Introduce Temporal to DevLake' is the name of this
[sprint](https://github.com/apache/incubator-devlake/milestone/5).
[...]
| `url` | varchar | 255 | The url of sprint.
[...]
| `status` | varchar | 255 | There're 3 statuses of a
sprint:<ul><li>CLOSED: a completed sprint</li><li>ACTIVE: a sprint started but
not completed</li><li>FUTURE: a sprint that has not started</li></ul>
[...]
| `started_date` | datetime | 3 | The start time of a sprint
[...]
| `ended_date` | datetime | 3 | The planned/estimated end time
of a sprint. It's usually set when planning a sprint.
[...]
| `completed_date` | datetime | 3 | The actual time to complete a
sprint.
[...]
-| `original_board_id` | datetime | 3 | The id of board where the
sprint first created. This field is not null only when this entity is
transformed from Jira sprints.<br/>In Jira, sprint and board entities have 2
types of relation:<ul><li>A sprint is created based on a specific board. In
this case, board(1):(n)sprint. The `original_board_id` is used to show the
relation.</li><li>A sprint can be mapped to multiple boards, a board can also
show multiple sprints. In this case, board [...]
+| `original_board_id` | datetime | 3 | The id of board where the
sprint first created. This field is not null only when this entity is
transformed from Jira sprints.<br/>In Jira, sprint and board entities have 2
types of relation:<ul><li>A sprint is created based on a specific board. In
this case, board(1):(n)sprint. The `original_board_id` is used to show the
relation.</li><li>A sprint can be mapped to multiple boards, a board can also
show multiple sprints. In this case, board [...]
#### sprint_issues
@@ -226,9 +226,9 @@ Information about GitHub or Gitlab repositories. A
repository is always owned by
| `id` | varchar | 255 | A repo's `id` is composed of "<
plugin >:< Entity >:< PK0 >[:PK1]..."<br/>For example, a Github repo's id is
like "< github >:< GithubRepos >< GithubRepoId >". Eg.
'github:GithubRepos:384111310' | PK |
| `name` | varchar | 255 | The name of repo.
|
|
| `description` | varchar | 255 | The description of repo.
|
|
-| `url` | varchar | 255 | The url of repo. Eg.
https://Github.com/merico-dev/lake
| |
+| `url` | varchar | 255 | The url of repo. Eg.
https://github.com/apache/incubator-devlake
| |
| `owner_id` | varchar | 255 | The id of the owner of repo
|
FK_accounts.id |
-| `language` | varchar | 255 | The major language of repo. Eg. The
language for merico-dev/lake is 'Go'
|
|
+| `language` | varchar | 255 | The major language of repo. Eg. The
language for apache/incubator-devlake is 'Go'
| |
| `forked_from` | varchar | 255 | Empty unless the repo is a fork in
which case it contains the `id` of the repo the repo is forked from.
|
|
| `deleted` | tinyint | 255 | 0: repo is active 1: repo has been
deleted
|
|
| `created_date` | datetime | 3 | Repo creation date
|
|
@@ -243,7 +243,7 @@ The table is filled in when the repo has been first
inserted on when an update r
| **field** | **type** | **length** | **description**
| **key** |
| :------------- | :------- | :--------- |
:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| :------ |
| `id` | varchar | 255 | A repo's `id` is composed of "<
plugin >:< Entity >:< PK0 >[:PK1]..."<br/>For example, a Github repo's id is
like "< github >:< GithubRepos >< GithubRepoId >". Eg.
'github:GithubRepos:384111310' | PK |
-| `language` | varchar | 255 | The language of repo.<br/>These are
the [languages](https://api.github.com/repos/merico-dev/lake/languages) for
merico-dev/lake
| |
+| `language` | varchar | 255 | The language of repo.<br/>These are
the
[languages](https://api.github.com/repos/apache/incubator-devlake/languages)
for apache/incubator-devlake
| |
| `bytes` | int | | The byte counts for all files in
those languages
| |
| `created_date` | datetime | 3 | The field is filled in with the
latest timestamp the query for a specific `repo_id` was done.
| |
@@ -263,7 +263,7 @@ A ref is the abstraction of a branch or tag.
| **field** | **type** | **length** | **description**
| **key**
|
| :----------- | :------- | :--------- |
:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| :---------- |
| `id` | varchar | 255 | A ref's `id` is composed of "< plugin
>:< Entity >:< PK0 >[:PK1]..."<br/>For example, a Github ref is composed of
"github:GithubRepos:< GithubRepoId >:< RefUrl >". Eg. The id of release v5.3.0
of PingCAP/TiDB project is 'github:GithubRepos:384111310:refs/tags/v5.3.0' A
repo's `id` is composed of "< plugin >:< Entity >:< PK0 >[:PK1]..." |
PK |
-| `ref_name` | varchar | 255 | The name of ref. Eg.
'[refs/tags/v0.9.3](https://github.com/merico-dev/lake/tree/v0.9.3)'
| |
+| `ref_name` | varchar | 255 | The name of ref. Eg.
'[refs/tags/v0.9.3](https://github.com/apache/incubator-devlake/tree/v0.9.3)'
| |
| `repo_id` | varchar | 255 | The id of repo this ref belongs to
|
FK_repos.id |
| `commit_sha` | char | 40 | The commit this ref points to at the
time of collection
|
|
| `is_default` | int | | <ul><li>0: the ref is the default
branch. By the definition of
[Github](https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/managing-branches-in-your-repository/changing-the-default-branch),
the default branch is the base branch for pull requests and code
commits.</li><li>1: not the default branch</li></ul> | |
@@ -274,7 +274,7 @@ A ref is the abstraction of a branch or tag.
This table shows the commits added in a new ref compared to an old ref. This
table can be used to support tag-based analysis, for instance, '_No. of commits
of a tag_', '_No. of merged pull request of a tag_', etc.
-The records of this table are computed by
[RefDiff](https://github.com/merico-dev/lake/tree/main/plugins/refdiff) plugin.
The computation should be manually triggered after using
[GitRepoExtractor](https://github.com/merico-dev/lake/tree/main/plugins/gitextractor)
to collect commits and refs. The algorithm behind is similar to
[this](https://github.com/merico-dev/lake/compare/v0.8.0%E2%80%A6v0.9.0).
+The records of this table are computed by
[RefDiff](https://github.com/apache/incubator-devlake/tree/main/plugins/refdiff)
plugin. The computation should be manually triggered after using
[GitRepoExtractor](https://github.com/apache/incubator-devlake/tree/main/plugins/gitextractor)
to collect commits and refs. The algorithm behind is similar to
[this](https://github.com/apache/incubator-devlake/compare/v0.8.0%E2%80%A6v0.9.0).
| **field** | **type** | **length** | **description**
| **key** |
| :------------------- | :------- | :--------- |
:-------------------------------------------------------------- |
:------------- |
@@ -358,7 +358,7 @@ A pull request is the abstraction of GitHub pull request
and Gitlab merge reques
| `description` | longtext | | The body/description of pull
request
| |
| `status` | varchar | 255 | the status of pull requests.
For a Github pull request, the status can either be 'open' or 'closed'.
| |
| `parent_pr_id` | varchar | 255 | The id of the parent PR
| |
-| `pull_request_key` | varchar | 255 | The key of PR. Eg, 1536 is the
key of this [PR](https://github.com/merico-dev/lake/pull/1563)
| |
+| `pull_request_key` | varchar | 255 | The key of PR. Eg, 1536 is the
key of this [PR](https://github.com/apache/incubator-devlake/pull/1563)
| |
| `base_repo_id` | varchar | 255 | The repo that will be updated.
| |
| `head_reop_id` | varchar | 255 | The repo containing the changes
that will be added to the base. If the head repository is NULL, this means that
the corresponding project had been deleted when DevLake processed the pull
request.
| |
| `base_ref` | varchar | 255 | The branch name in the base
repo that will be updated
| |
@@ -372,8 +372,13 @@ A pull request is the abstraction of GitHub pull request
and Gitlab merge reques
| `merged_date` | datetime | 3 | The time PR gets merged. Null
when the PR is not merged.
| |
| `closed_date` | datetime | 3 | The time PR closed. Null when
the PR is not closed.
| |
| `merge_commit_sha` | char | 40 | the merge commit of this PR. By
the definition of
[Github](https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/managing-branches-in-your-repository/changing-the-default-branch),
when you click the default Merge pull request option on a pull request on
Github, all commits from the feature branch are added to the base branch in a
merge commit. | |
-| `base_commit_sha` | char | 40 | The base commit of this PR.
| |
-| `head_commit_sha` | char | 40 | The head commit of this PR.
| |
+| `base_commit_sha` | char | 40 | The base commit of this PR.
| |
+| `head_commit_sha` | char | 40 | The head commit of this PR.
| |
+| `coding_time_span`| bigint | | PR created_date - PR's first
commit's authored_date | |
+| `review_lag` | bigint | | PR's first comment time - PR's
created_date | |
+| `review_time_span`| bigint | | PR merged_date - PR's first
comment time | |
+| `deploy_timespan` | bigint | | PR deployed date - PR
merged_date | |
+| `change_timespan` | bigint | | PR cycle time, equals to
coding_time_span + review_lag + review_time_span + deploy_timespan |
|
#### pull_request_labels
@@ -493,7 +498,7 @@ A low-level mapping between "issue tracking" and "source
code management" domain
The original connection between these two entities lies in either issue
tracking tools like Jira or source code management tools like GitLab. You have
to use tools to accomplish this.
-For example, a common method to connect Jira issue and GitLab commit is a
GitLab plugin [Jira Integration](https://docs.gitlab.com/ee/integration/jira/).
With this plugin, the Jira issue key in the commit message written by the
committers will be parsed. Then, the plugin will add the commit urls under this
jira issue. Hence, DevLake's [Jira
plugin](https://github.com/merico-dev/lake/tree/main/plugins/jira) can get the
related commits (including repo, commit_id, url) of an issue.
+For example, a common method to connect Jira issue and GitLab commit is a
GitLab plugin [Jira Integration](https://docs.gitlab.com/ee/integration/jira/).
With this plugin, the Jira issue key in the commit message written by the
committers will be parsed. Then, the plugin will add the commit urls under this
jira issue. Hence, DevLake's [Jira
plugin](https://github.com/apache/incubator-devlake/tree/main/plugins/jira) can
get the related commits (including repo, commit_id, url) of an issue.
| **field** | **type** | **length** | **description** | **key** |
| :----------- | :------- | :--------- | :-------------- | :------------- |
@@ -576,7 +581,7 @@ DWM entities are the slight aggregation and operation of
DWD to store more organ
#### refs_issues_diffs
-This table shows the issues fixed by commits added in a new ref compared to an
old one. The data is computed from
[table.ref_commits_diff](https://merico.feishu.cn/docs/doccnvyuG9YpVc6lvmWkmmbZtUc#yJOyqa),
[table.pull_requests](https://merico.feishu.cn/docs/doccnvyuG9YpVc6lvmWkmmbZtUc#Uc849c),
[table.pull_request_commits](https://merico.feishu.cn/docs/doccnvyuG9YpVc6lvmWkmmbZtUc#G9cPfj),
and
[table.pull_request_issues](https://merico.feishu.cn/docs/doccnvyuG9YpVc6lvmWkmmbZtUc#we6Uac).
+This table shows the issues fixed by commits added in a new ref compared to an
old one. The data is computed from
[table.ref_commits_diff](#refs_commits_diffs),
[table.pull_requests](#pull_requests),
[table.pull_request_commits](#pull_request_commits), and
[table.pull_request_issues](#pull_request_issues).
This table can support tag-based analysis, for instance, '_No. of bugs closed
in a tag_'.
diff --git a/docs/Metrics/LeadTimeForChanges.md
b/docs/Metrics/LeadTimeForChanges.md
index 5a38dd1c4..0c8dfc764 100644
--- a/docs/Metrics/LeadTimeForChanges.md
+++ b/docs/Metrics/LeadTimeForChanges.md
@@ -102,7 +102,7 @@ _clt as (
),
_calendar_months as(
--- to deal with the month with no incidents
+-- to deal with the month with no incidents
SELECT date_format(CAST((SYSDATE()-INTERVAL (month_index) MONTH) AS
date), '%y/%m') as month
FROM ( SELECT 0 month_index
UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL
SELECT 3
@@ -129,7 +129,6 @@ If you want to measure in which category your team falls
into as the picture sho

```
--- Metric 2: median change lead time
with _pr_stats as (
-- get PRs' cycle time in the selected period
SELECT