dosubot[bot] commented on issue #7826: URL: https://github.com/apache/incubator-devlake/issues/7826#issuecomment-2265291246
Hello, @wouldd! I'm a bot that can help you solve bugs, answer questions, and become a contributor. I'm here to assist you while you wait for a human maintainer.

The disappearance of Jira issues from the dataset in DevLake, especially for projects with a large number of issues, could be due to the way the `id` field is constructed and used as a primary key. According to the schema, the `id` for a Jira issue is composed of several parts, including the Jira source ID and the Jira issue ID. If two issues end up with the same `id`, or an `id` is built inconsistently, issues may not be recorded or retrieved properly.

To resolve this, verify that the Jira source ID and Jira issue ID are captured correctly and concatenated into a unique identifier for each issue. Also check for data truncation or transformation steps that might affect the uniqueness of the `id` field.

Here is the relevant part of the schema for reference:

```md
| `id` | varchar | 255 | An issue's `id` is composed of "< plugin >:< Entity >:< PK0 >[:PK1]..." <ul><li>For Github issues, a Github issue's id is like "github:GithubIssues:< GithubIssueId >". Eg. 'github:GithubIssues:1049355647'</li> <li>For Jira issues, a Jira issue's id is like "jira:JiraIssues:< JiraSourceId >:< JiraIssueId >". Eg. 'jira:JiraIssues:1:10063'. < JiraSourceId > is used to identify which jira source the issue came from, since DevLake users can import data from several different Jira instances at the same time.</li></ul> | PK |
```

Ensure that the `JiraSourceId` and `JiraIssueId` are correctly and uniquely assigned to avoid any conflicts or data loss.
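
As a rough illustration of the `id` format described in the schema, the following Python sketch builds and parses ids of the form `jira:JiraIssues:<JiraSourceId>:<JiraIssueId>`, flags duplicates, and checks the 255-character column limit. The function names (`make_issue_id`, `parse_issue_id`, `find_duplicate_ids`, `fits_id_column`) are hypothetical helpers written for this example. They are not part of DevLake, and DevLake's own id generation may differ.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Values taken from the schema excerpt above; the helpers are illustrative only.
PLUGIN = "jira"
ENTITY = "JiraIssues"
ID_COLUMN_MAX_LEN = 255  # `id` is declared as varchar(255) in the schema


def make_issue_id(source_id: int, issue_id: int) -> str:
    """Build an id like 'jira:JiraIssues:1:10063'."""
    return f"{PLUGIN}:{ENTITY}:{source_id}:{issue_id}"


def parse_issue_id(domain_id: str) -> Tuple[int, int]:
    """Split an id back into (JiraSourceId, JiraIssueId); raise if malformed."""
    parts = domain_id.split(":")
    if len(parts) != 4 or parts[0] != PLUGIN or parts[1] != ENTITY:
        raise ValueError(f"not a Jira issue id: {domain_id!r}")
    return int(parts[2]), int(parts[3])


def find_duplicate_ids(ids: Iterable[str]) -> List[str]:
    """Return ids that occur more than once, i.e. primary-key collisions."""
    counts = Counter(ids)
    return sorted(i for i, n in counts.items() if n > 1)


def fits_id_column(domain_id: str) -> bool:
    """Check that the id would not be truncated by a varchar(255) column."""
    return len(domain_id) <= ID_COLUMN_MAX_LEN
```

For example, running `find_duplicate_ids` over ids generated from the raw Jira issue list is one way to check whether distinct issues would collapse onto a single primary key before they reach the domain layer.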
Additionally, consider reviewing the domain layer schema and the data flow architecture to ensure that the data transformation processes are correctly handling the unique identifiers and not causing any data inconsistencies<sup>[[1]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.15/DataModels/DevLakeDomainLayerSchema.md)</sup><sup>[[2]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.16/DataModels/DevLakeDomainLayerSchema.md)</sup>.

Furthermore, if you are using webhooks, ensure that each project has its own webhook. DevLake determines the project an issue or deployment belongs to by segregating between the webhooks. If a webhook is used by multiple projects, all issues or deployments published by that webhook will be replicated among those projects, leading to potential data inconsistencies. Each project should have its own dedicated webhook to avoid such issues<sup>[[3]](https://github.com/apache/incubator-devlake-website/blob/main/docs/Troubleshooting/Dashboard.md)</sup><sup>[[4]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.21/Troubleshooting/Dashboard.md)</sup><sup>[[5]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.19/Troubleshooting/Dashboard.md)</sup>.

For further troubleshooting, you can refer to the DORA Validation Dashboard in DevLake, which provides a step-by-step guide to check which step went wrong for all four DORA metrics. This can help you identify if the issue lies in the data mapping or querying process.
Each chart in the dashboard has a hidden button in the top-right corner to access the context menu, where you can click `Edit` to view the SQL scripts behind the charts and understand how the data is queried<sup>[[3]](https://github.com/apache/incubator-devlake-website/blob/main/docs/Troubleshooting/Dashboard.md)</sup><sup>[[4]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.21/Troubleshooting/Dashboard.md)</sup><sup>[[5]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.19/Troubleshooting/Dashboard.md)</sup>.

If none of these solutions resolve your problem, consider creating an issue on the DevLake GitHub repository for further assistance.

*To continue talking to [Dosu](https://dosu.dev), mention @dosu.*

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@devlake.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org