Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
dosubot[bot] commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2588898700

To configure the AE plugin in DevLake, follow these steps:

1. **Create a Connection**: Use the `PostConnections` endpoint to create a new AE connection. You need to send a JSON body with the connection details.

   ```go
   // @Summary create ae connection
   // @Description Create AE connection
   // @Tags plugins/ae
   // @Param body body models.AeConnection true "json body"
   // @Success 200 {object} models.AeConnection "Success"
   // @Failure 400 {string} errcode.Error "Bad Request"
   // @Failure 500 {string} errcode.Error "Internal Error"
   // @Router /plugins/ae/connections [POST]
   ```

2. **Test the Connection**: After creating the connection, test it using the `TestConnection` endpoint to ensure it is set up correctly.

   ```go
   // TestConnection test ae connection
   // @Summary test ae connection
   // @Description Test AE Connection
   // @Tags plugins/ae
   // @Param body body models.AeConn true "json body"
   // @Success 200 {object} shared.ApiBody "Success"
   // @Failure 400 {string} errcode.Error "Bad Request"
   // @Failure 500 {string} errcode.Error "Internal Error"
   // @Router /plugins/ae/test [POST]
   ```

3. **Manage Connections**: You can list, get details, update, or delete connections using the respective endpoints (`ListConnections`, `GetConnection`, `PatchConnection`, `DeleteConnection`).

These steps should help you configure the AE plugin in DevLake [[1]](https://github.com/apache/incubator-devlake/blob/main/backend/plugins/ae/api/connection.go).

*To continue talking to [Dosu](https://dosu.dev), mention @dosu.*

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@devlake.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org
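As a rough illustration of step 1, the sketch below builds (but does not send) a POST request for the `/plugins/ae/connections` route quoted above. The base URL and the payload field are placeholders chosen for this example; the real body fields are whatever `models.AeConnection` defines, which this thread does not list.

```python
import json
import urllib.request


def build_create_connection_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for /plugins/ae/connections."""
    url = base_url.rstrip("/") + "/plugins/ae/connections"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical base URL and payload: replace both with the values of your
# deployment and the fields that models.AeConnection actually expects.
request = build_create_connection_request(
    "http://localhost:8080", {"name": "example-ae-connection"}
)
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen` would actually send it; that step is left out here so the example runs without a DevLake server.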
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
yu501954162 commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2533506282

@dosu What is the function of the convertSprintIssues plugin?
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
dosubot[bot] commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2533522520

To improve the memory usage of DevLake, you can increase the `innodb_buffer_pool_size` for MySQL. This can be done by adding the following line to the MySQL container in your `docker-compose` file:

```yaml
mysql:
  image: mysql:8.
  ...
  # add the following line to the mysql container
  command: --innodb-buffer-pool-size=200M
```

This adjustment allocates more memory to the InnoDB buffer pool, which can be beneficial when dealing with large datasets. It can also help prevent errors about the total number of locks exceeding the lock table size, which can occur when purging expired data or handling large repositories or boards [[1]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.18/Troubleshooting/Configuration.md)[[2]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.17/Troubleshooting/Configuration.md)[[3]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.15/Troubleshooting/Configuration.md).
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
dosubot[bot] commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2533507318

The function of the `convertSprintIssues` plugin in DevLake is to convert Jira sprint issues into a format that can be used within the DevLake platform. It selects all issues that belong to a specific board and converts each issue into a `SprintIssue` object with generated domain IDs for both the sprint and the issue.

This conversion is performed by a data converter that processes each input row (representing a Jira sprint issue) and outputs the corresponding `SprintIssue` object. The function is part of the Jira plugin and is enabled by default to handle data related to tickets [[1]](https://github.com/apache/incubator-devlake/blob/main/backend/plugins/jira/tasks/sprint_issues_convertor.go).
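The row-by-row conversion described above can be pictured with a small sketch. This is not the Jira plugin's code: the row shape, the `domain_id` helper, and the ID format are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class JiraSprintIssueRow:
    """One input row: a Jira issue that belongs to a sprint (hypothetical shape)."""
    connection_id: int
    sprint_id: int
    issue_id: int


@dataclass(frozen=True)
class SprintIssue:
    """Domain-layer record linking a sprint to an issue by generated IDs."""
    sprint_id: str
    issue_id: str


def domain_id(entity: str, connection_id: int, raw_id: int) -> str:
    # Assumed ID scheme, for illustration only: plugin:entity:connection:raw id.
    return f"jira:{entity}:{connection_id}:{raw_id}"


def convert_sprint_issues(rows: Iterable[JiraSprintIssueRow]) -> Iterator[SprintIssue]:
    """Convert each input row into exactly one SprintIssue."""
    for row in rows:
        yield SprintIssue(
            sprint_id=domain_id("JiraSprint", row.connection_id, row.sprint_id),
            issue_id=domain_id("JiraIssue", row.connection_id, row.issue_id),
        )


rows = [JiraSprintIssueRow(1, 10, 100), JiraSprintIssueRow(1, 10, 101)]
converted = list(convert_sprint_issues(rows))
print(converted[0])
```

The point of the generated IDs is that the sprint and the issue can be joined against other domain tables without knowing which connection or raw Jira ID they came from.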
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
dosubot[bot] commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2481770675

To optimize the performance of the "convert users" and "extract users" processes in the GitLab plugin of DevLake, you can adjust the rate limit configuration. By default, DevLake uses a dynamic rate limit of around 12,000 requests per hour for GitLab, but you can increase this to a maximum of 120,000 requests per hour for GitLab Cloud, depending on your specific needs and constraints [[1]](https://github.com/apache/incubator-devlake/blob/main/config-ui/src/plugins/register/gitlab/config.tsx). This adjustment can help speed up data collection and processing times.
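To get a feel for what the two rate limits mean in practice, the arithmetic can be sketched as follows. The workload of 600,000 requests is a made-up figure, and the estimate assumes the rate limit is the only bottleneck, which real collections rarely satisfy.

```python
def requests_per_second(rate_per_hour: int) -> float:
    """Convert an hourly rate limit into requests per second."""
    return rate_per_hour / 3600


def hours_to_collect(total_requests: int, rate_per_hour: int) -> float:
    """Lower bound on collection time if the rate limit is the only bottleneck."""
    if rate_per_hour <= 0:
        raise ValueError("rate_per_hour must be positive")
    return total_requests / rate_per_hour


TOTAL_REQUESTS = 600_000  # hypothetical workload, not a measured value

for rate in (12_000, 120_000):  # the default and maximum quoted above
    print(
        rate,
        round(requests_per_second(rate), 2),
        hours_to_collect(TOTAL_REQUESTS, rate),
    )
```

Under these assumptions a tenfold higher limit shortens the lower bound tenfold, but only if the GitLab instance itself permits that rate.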
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
yu501954162 commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2481769723

@dosu GitLab "convert users" and "extract users" take a long time.
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
dosubot[bot] commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2475637722

The message field in the commit table is displaying garbled text for Chinese characters because the migration script modifies the `message` column to use `LONGTEXT CHARACTER SET binary`. This character set does not support proper encoding for Chinese characters, leading to garbled text. The `binary` character set treats the text as raw bytes, which is not suitable for multi-byte character sets like those used for Chinese characters [[1]](https://github.com/apache/incubator-devlake/blob/main/backend/plugins/sonarqube/models/migrationscripts/20240325_modify_commit_character_type.go).
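The general effect of treating text as raw bytes can be shown in a few lines. This is only an illustration of a byte/charset mismatch; it does not reproduce the migration script or MySQL's behavior, and the sample message is invented.

```python
message = "修复登录问题"  # sample commit message ("fix login issue")

# The raw bytes that represent the text when it is stored without a charset.
raw_bytes = message.encode("utf-8")

# Interpreting those bytes with the wrong character set yields garbled text.
garbled = raw_bytes.decode("latin-1")

# Interpreting the same bytes as UTF-8 recovers the original message.
recovered = raw_bytes.decode("utf-8")

print(len(message), len(raw_bytes))
print(garbled != message, recovered == message)
```

The bytes themselves are intact in this sketch; what goes wrong is the interpretation when they are read back, which is why the garbling is reversible here.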
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
yu501954162 commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2475636125

@dosu In the collected data, Chinese text in the message field of the commit table is garbled.
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
dosubot[bot] commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2469843563

To configure tasks to run in parallel in DevLake, you can utilize the pipeline structure, which allows tasks within the same stage to be executed in parallel. Here's how it works:

1. **Pipeline Structure**: A pipeline is organized as a two-dimensional array of tasks. Tasks within the same stage (e.g., stage1, stage2) are executed in parallel.
2. **Task Execution**: Tasks in each stage run concurrently, and a stage starts only after all tasks in the previous stages have completed.
3. **Subtask Execution**: Within each plugin, subtasks are executed sequentially, even though the tasks of a stage run in parallel.

This setup allows for efficient parallel processing of tasks while maintaining dependencies across different stages [[1]](https://github.com/apache/incubator-devlake-website/blob/main/blog/2022-06-17-How%20DevLake%20is%20up%20and%20running/index.md).
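The stage semantics described above (stages in sequence, tasks within a stage concurrently) can be sketched with a thread pool. This is a toy model, not DevLake's scheduler; `run_pipeline`, `make_task`, and the `post_processing` task name are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

Task = Callable[[], str]


def run_pipeline(stages: List[List[Task]]) -> List[List[str]]:
    """Run stages one after another; run the tasks inside each stage concurrently."""
    results = []
    for stage in stages:
        with ThreadPoolExecutor(max_workers=max(1, len(stage))) as pool:
            futures = [pool.submit(task) for task in stage]
            # future.result() blocks until that task finishes, so the next
            # stage starts only after every task in this stage has completed.
            results.append([future.result() for future in futures])
    return results


def make_task(name: str) -> Task:
    """Create a trivial task that reports its own name."""
    return lambda: f"{name} done"


stages = [
    [make_task("jira"), make_task("gitlab")],  # stage 1: two tasks side by side
    [make_task("post_processing")],  # stage 2: starts after stage 1 finishes
]
results = run_pipeline(stages)
print(results)
```

Results are gathered in submission order, so the output layout mirrors the two-dimensional plan regardless of which task finishes first.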
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
dosubot[bot] commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2469865048

To configure tasks to run in parallel within a stage in the DevLake configuration file, you need to define a single stage with multiple tasks in a 2D array format. Each stage is represented by an array, and tasks within the same stage run in parallel.

Here is an example configuration for running a GitLab task and a Jira task in parallel within a single stage:

```json
[
  [
    {
      "plugin": "jira",
      "options": {
        "boardId": 8,
        "connectionId": 1
      }
    },
    {
      "plugin": "gitlab",
      "options": {
        "connectionId": 1,
        "projectId": 116***98
      }
    }
  ]
]
```

In this configuration, both the Jira and GitLab tasks will run in parallel because they are defined within the same stage (the same inner array) [[1]](https://github.com/apache/incubator-devlake-website/blob/main/docs/Configuration/AdvancedMode.md).
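If the plan is generated by a script rather than written by hand, a sketch along these lines may help keep the nesting right. The helper names are hypothetical, the shape check is only what the example above implies, and `projectId=12345` is a placeholder because the real ID is masked in the thread.

```python
import json
from typing import Any, Dict, List

Task = Dict[str, Any]
Plan = List[List[Task]]


def make_task(plugin: str, **options: Any) -> Task:
    """Build one task entry with the two keys used in the example plan."""
    return {"plugin": plugin, "options": options}


def validate_plan(plan: Plan) -> None:
    """Raise ValueError unless plan is a list of non-empty stages of task dicts."""
    if not isinstance(plan, list):
        raise ValueError("plan must be a list of stages")
    for stage in plan:
        if not isinstance(stage, list) or not stage:
            raise ValueError("each stage must be a non-empty list of tasks")
        for task in stage:
            if not isinstance(task, dict) or "plugin" not in task or "options" not in task:
                raise ValueError("each task needs 'plugin' and 'options'")


# One stage (the inner list) holding two tasks that are meant to run in parallel.
plan = [
    [
        make_task("jira", boardId=8, connectionId=1),
        make_task("gitlab", connectionId=1, projectId=12345),  # placeholder ID
    ]
]
validate_plan(plan)
print(json.dumps(plan, indent=2))
```

A common mistake this catches is forgetting the inner array, which turns each task into its own malformed stage.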
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
yu501954162 commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2469823127

@dosu How can tasks be run in parallel within a stage?
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
dosubot[bot] commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2469826020

Tasks can be run in parallel within a stage in DevLake. A stage is a collection of tasks performed by data plugins, and these tasks are executed in parallel within any stage. This parallel execution allows for efficient processing of data collection, extraction, conversion, and enrichment jobs. The tasks from different plugins can be executed concurrently within the same stage [[1]](https://github.com/apache/incubator-devlake-website/blob/main/blog/2022-06-17-How%20DevLake%20is%20up%20and%20running/index.md)[[2]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.15/Overview/KeyConcepts.md)[[3]](https://github.com/apache/incubator-devlake/blob/main/backend/server/services/blueprint_test.go).
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
dosubot[bot] commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2469732633

Could you provide more details about the specific software or tool you are referring to with versions v1.0.0 and v1.0.1? This will help me give you a more accurate comparison.
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
yu501954162 commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2469730888

@dosu What is the difference between v1.0.0 and v1.0.1?
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
dosubot[bot] commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2464168434

You can configure the Data Collection Modes in DevLake through the Config UI. To access the Config UI, visit [http://localhost:4000](http://localhost:4000). Once there, you can choose the desired mode for collecting data as part of Step 2.4 in the data collection process. DevLake supports three data collection modes: "Collect Data (Default)," "Collect Data in Full Refresh Mode," and "Re-transform Data" [[1]](https://github.com/apache/incubator-devlake-website/blob/main/docs/Configuration/Tutorial.md)[[2]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v1.0/Configuration/Tutorial.md)[[3]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.20/Configuration/Tutorial.md).
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
yu501954162 commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2464165003

@dosu Where can the Data Collection Modes be configured?
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
dosubot[bot] commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2464157684

To optimize DevLake's performance, consider the following suggestions:

1. **Data Collection Configuration**:
   - Use the **Time Filter** to limit the data collection to a specific time range.
   - Adjust the **Frequency** of data synchronization to balance between data freshness and system load.
   - Enable the **"Skip failed tasks"** option to allow the pipeline to continue running even if some tasks fail, which helps in managing large data volumes efficiently [[1]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.21/Configuration/Tutorial.md).
2. **Parallel Processing**:
   - Increase the `PIPELINE_MAX_PARALLEL` setting to allow multiple pipelines to run concurrently, enhancing data collection speed [[2]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.20/Configuration/Tutorial.md).
3. **Resource Allocation**:
   - Ensure your system has sufficient CPU, memory, and disk resources. A robust multi-core CPU, at least 16GB of RAM, and SSD storage are recommended for handling large datasets.
4. **Data Collection Modes**:
   - Use **Collect Data (Default)** mode for incremental refreshes where supported.
   - Opt for **Collect Data in Full Refresh Mode** if you need to re-collect all data within a specific range.
   - Use **Re-transform Data** to apply new transformation rules without collecting new data [[1]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.21/Configuration/Tutorial.md).
5. **Rate Limits**:
   - Adjust rate limits for data sources like GitLab and Jira to optimize data collection speed without exceeding API limits.

Implementing these strategies can help improve the performance and efficiency of DevLake in handling large datasets.
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
yu501954162 commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2464155042

@dosu Do you have suggestions for optimizing DevLake performance?
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
dosubot[bot] commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2463863625

The cycle and configuration method for storing data in DevLake involves the following steps:

1. **Add Data Connections**: Configure the endpoint and authentication details to connect to the source data. Add data scope, such as Git repositories, issue boards, or CI/CD pipelines, to determine what data should be collected. Optionally, define specific data entities or transformation rules within the data scope.
2. **Collect Data in a Project**: Create a project, associate connections with the project, and select specific data scopes. Set the synchronization policy, which includes specifying the sync frequency, time range, and the skip-on-fail option. Start data collection by choosing the desired mode: "Collect Data (Default)", "Collect Data in Full Refresh Mode", or "Re-transform Data".
3. **Check the Data in Grafana Dashboards**: View the collected data using Grafana Dashboards by clicking on the "Dashboards" button in the top-right corner of the Config UI.

The synchronization policy allows you to set a time filter, frequency, and running policy. The "Skip failed tasks" option is enabled by default to prevent data loss during large data collections. You can rerun failed tasks later from the blueprint's detail page [[1]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v1.0/Configuration/Tutorial.md)[[2]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.19/Configuration/Tutorial.md)[[3]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.18/Configuration/Tutorial.md)[[4]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.21/Configuration/Tutorial.md)[[5]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.20/Configuration/Tutorial.md).
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
yu501954162 commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2463861932

@dosu What are the cycle and configuration method for storing data in DevLake?
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
dosubot[bot] commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2463616752

You can change the configuration of `PIPELINE_MAX_PARALLEL` in DevLake by setting it in the configuration file or as an environment variable. An example configuration file is located at `devops/releases/lake-v0.21.0/env.example` [[1]](https://github.com/apache/incubator-devlake/blob/main/devops/releases/lake-v0.21.0/env.example). Make sure the value is a positive integer: a negative value will cause a panic, and a value of zero means no limit on parallel pipelines [[2]](https://github.com/apache/incubator-devlake/blob/main/backend/server/services/pipeline.go).
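The value rules stated above (positive integer, zero for no limit, negative rejected) can be summarized in a short sketch. It is not DevLake's implementation: the function name is hypothetical, the default of `1` is the figure mentioned elsewhere in this thread, and a Python `ValueError` stands in for the panic.

```python
from typing import Mapping, Optional


def pipeline_parallel_limit(env: Mapping[str, str]) -> Optional[int]:
    """Return the maximum number of concurrent pipelines, or None for no limit."""
    # Default of "1" is an assumption taken from elsewhere in this thread.
    raw = env.get("PIPELINE_MAX_PARALLEL", "1")
    value = int(raw)  # raises ValueError for non-integer text
    if value < 0:
        # Stand-in for the panic described above.
        raise ValueError("PIPELINE_MAX_PARALLEL must not be negative")
    if value == 0:
        return None  # zero means no limit on parallel pipelines
    return value


print(pipeline_parallel_limit({"PIPELINE_MAX_PARALLEL": "4"}))
print(pipeline_parallel_limit({"PIPELINE_MAX_PARALLEL": "0"}))
```

In a real deployment the mapping would be `os.environ`; a plain dict is used here so the three cases are easy to see.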
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
yu501954162 commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2463616023

@dosu Where can I change the configuration of `PIPELINE_MAX_PARALLEL`?
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
dosubot[bot] commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2463599312

I wasn't able to find specific recommended hardware specifications for memory, CPU, and disk space for running Apache DevLake in a production environment within the repository. However, for handling large amounts of data, it's generally advisable to ensure that your system has sufficient resources to accommodate the data processing and storage needs. This typically means having a robust multi-core CPU, ample RAM (at least 16GB or more), and sufficient disk space, preferably on SSDs, to handle the data volume and ensure smooth operation. Adjusting these resources based on your specific data size and processing requirements can help optimize performance.
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
yu501954162 commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2463598144

@dosu What memory, CPU, and disk are recommended for DevLake to collect large amounts of data?
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
dosubot[bot] commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2463580131

Yes, DevLake can configure pipeline parallel collection for different projects. To enable or optimize parallel data collection, you need to adjust the `PIPELINE_MAX_PARALLEL` configuration setting. By default, it is set to `1`, which means that only one pipeline runs at a time. Increasing this value allows more pipelines to run concurrently, optimizing parallel data collection [[1]](https://github.com/apache/incubator-devlake/blob/main/devops/releases/lake-v1.0.0/env.example).
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
yu501954162 commented on issue #8181:
URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2463579150

@dosu Can DevLake be configured to run pipelines in parallel for different projects?
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
dosubot[bot] commented on issue #8181: URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2456568513 To efficiently collect data scopes with a large amount of data in DevLake, consider the following configurations: 1. **Sync Policies**: - **Time Filter**: Select the desired time range for syncing data to optimize the collection process. - **Frequency**: Set the sync frequency or specify a cron code for a custom schedule. - **Running Policy**: Enable "Skip failed tasks" to prevent data loss and allow the pipeline to continue running even if some tasks fail. You can rerun the failed tasks later. 2. **Data Collection Modes**: - Use **Collect Data (Default)** mode for recurring pipelines, which supports incremental refresh for tools and entities that allow it. - Use **Collect Data in Full Refresh Mode** if you need to delete and re-collect all existing data within the designated time range. - **Re-transform Data**: This mode applies the latest transformation rules to existing data without collecting new data. 3. **Configuration Steps**: - **Add Data Connections**: Configure the endpoint and authentication details, and add data scopes like Git repositories or issue boards. - **Create a Project**: Associate connections with the project and set the synchronization policy, including sync frequency, time range, and skip-on-fail options. - **Start Data Collection**: Choose the desired mode for collecting data. 
These configurations help manage and optimize the data collection process, especially with large datasets [[1]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.21/Configuration/Tutorial.md)[[2]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.20/Configuration/Tutorial.md)[[3]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.19/Configuration/Tutorial.md)[[4]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.18/Configuration/Tutorial.md)[[5]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v1.0/Configuration/Tutorial.md).
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
yu501954162 commented on issue #8181: URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2456144127 Hello @dosu, data collection for our GitLab and Jira projects is slow: more than 1,000 data scopes take about 3 days to collect. Is there any configuration that can improve DevLake's collection efficiency and reduce the data collection time period?
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
dosubot[bot] commented on issue #8181: URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2456144975

To improve data collection efficiency and reduce the data collection time period for GitLab and Jira projects in DevLake, you can adjust the "Fixed Rate Limit" setting for both platforms.

For GitLab, increasing the rate limit speeds up data collection. The maximum rate limit for GitLab Cloud is 120,000 requests/hour; make sure you do not exceed the limit set by your own GitLab instance. Unselecting GitLab entities that you do not need to collect also speeds up collection, and the sync frequency determines how often data is collected.

For Jira, raising or lowering the rate limit speeds up or slows down data collection. If you encounter a 403 error, lower the rate limit. Contact your Jira Server admin to get or set the maximum rate limit of your Jira instance. Unselecting Jira entities that you do not need to collect also helps, and configuring the data time range and sync frequency can further optimize data collection [[1]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.17/Configuration/GitLab.md)[[2]](https://github.com/apache/incubator-devlake-website/blob/main/docs/Configuration/Jira.md).
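
To get a feel for how much the rate limit matters, the following Python sketch estimates a lower bound on collection time when the rate limit is the only bottleneck. The 120,000 requests/hour figure is the one quoted above; the workload (1,000 data scopes at roughly 3,000 API requests each) and the function names are hypothetical and purely illustrative.

```python
GITLAB_CLOUD_MAX_PER_HOUR = 120_000  # figure quoted in the comment above


def estimated_hours(total_requests: int, rate_limit_per_hour: int) -> float:
    """Lower bound on collection time if the rate limit is the only bottleneck."""
    if rate_limit_per_hour <= 0:
        raise ValueError("rate limit must be positive")
    return total_requests / rate_limit_per_hour


def capped_rate_limit(requested: int, instance_max: int) -> int:
    """Never configure more than the instance allows."""
    return min(requested, instance_max)


# Hypothetical workload: 1,000 data scopes at ~3,000 API requests each.
total = 1_000 * 3_000
for limit in (10_000, 40_000, GITLAB_CLOUD_MAX_PER_HOUR):
    hours = estimated_hours(total, limit)
    print(f"{limit:>7} req/h -> {hours:.1f} h")
```

Real runs will take longer than this bound because of network latency, pagination, and database writes, but the estimate shows why a low fixed rate limit alone can stretch a large collection over days.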
Re: [I] [Question][GItlab] Most GitLab Projects Not Selectable [incubator-devlake]
dosubot[bot] commented on issue #8181: URL: https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2454285183

Hello @yu501954162! I'm here to help you with any bugs, questions, or contributions. Let's tackle this GitLab issue together.

Most of your GitLab projects might not be selectable in the "Add Data Scope" section because repositories where you have only guest permissions, and repositories that are archived, are excluded from the list. Make sure your role in those GitLab projects is higher than Guest, and check whether any of the projects are archived [[1]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v1.0/Configuration/GitLab.md).
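
One way to check which projects fall under these two exclusions is to inspect the project objects returned by the GitLab REST API, which include an `archived` flag and a `permissions` block with numeric access levels (Guest is 10, Reporter is 20). The Python sketch below applies that check to hypothetical sample data; whether DevLake uses exactly this rule internally is an assumption, and the helper names are made up for illustration.

```python
GUEST = 10  # GitLab access level for the Guest role


def access_level(project: dict) -> int:
    """Highest access level granted directly or via a group; 0 if none."""
    perms = project.get("permissions") or {}
    levels = [
        (perms.get(key) or {}).get("access_level", 0)
        for key in ("project_access", "group_access")
    ]
    return max(levels)


def is_selectable(project: dict) -> bool:
    """Assumed rule: not archived, and role higher than Guest."""
    return not project.get("archived", False) and access_level(project) > GUEST


# Hypothetical sample shaped like GitLab project objects.
projects = [
    {"path_with_namespace": "team/api", "archived": False,
     "permissions": {"project_access": {"access_level": 30}, "group_access": None}},
    {"path_with_namespace": "team/legacy", "archived": True,
     "permissions": {"project_access": {"access_level": 40}, "group_access": None}},
    {"path_with_namespace": "other/docs", "archived": False,
     "permissions": {"project_access": {"access_level": 10}, "group_access": None}},
]

selectable = [p["path_with_namespace"] for p in projects if is_selectable(p)]
print(selectable)
```

The same filter can be expressed server-side with the `archived=false` and `min_access_level=20` query parameters of GitLab's `GET /projects` endpoint, which makes it easy to compare the API result with what DevLake shows.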