dosubot[bot] commented on issue #8181:
URL: 
https://github.com/apache/incubator-devlake/issues/8181#issuecomment-2456568513

   To efficiently collect data scopes with a large amount of data in DevLake, 
consider the following configurations:
   
   1. **Sync Policies**:
      - **Time Filter**: Limit syncing to the time range you actually need; a 
narrower range means less data to collect.
      - **Frequency**: Set the sync frequency, or specify a cron expression for 
a custom schedule.
      - **Running Policy**: Enable "Skip failed tasks" so that the pipeline 
keeps running when some tasks fail and the data already collected is not lost. 
You can rerun the failed tasks later.
   
   2. **Data Collection Modes**:
      - Use **Collect Data (Default)** mode for recurring pipelines, which 
supports incremental refresh for tools and entities that allow it.
      - Use **Collect Data in Full Refresh Mode** if you need to delete and 
re-collect all existing data within the designated time range.
      - **Re-transform Data**: This mode applies the latest transformation 
rules to existing data without collecting new data.
   
   3. **Configuration Steps**:
      - **Add Data Connections**: Configure the endpoint and authentication 
details, and add data scopes like Git repositories or issue boards.
      - **Create a Project**: Associate connections with the project and set 
the synchronization policy, including sync frequency, time range, and 
skip-on-fail options.
      - **Start Data Collection**: Choose the desired mode for collecting data.
   
   These configurations help manage and optimize the data collection process, 
especially with large datasets 
<sup>[[1]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.21/Configuration/Tutorial.md)</sup><sup>[[2]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.20/Configuration/Tutorial.md)</sup><sup>[[3]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.19/Configuration/Tutorial.md)</sup><sup>[[4]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v0.18/Configuration/Tutorial.md)</sup><sup>[[5]](https://github.com/apache/incubator-devlake-website/blob/main/versioned_docs/version-v1.0/Configuration/Tutorial.md)</sup>.
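
As a rough illustration of how the settings above fit together, the following Python sketch models a sync policy and a mode choice. Everything in it is hypothetical: `SyncPolicy`, `CollectionMode`, and `pick_mode` are names invented for this example and are not part of DevLake, and the five-field cron format is an assumption. It only mirrors the options you would set in the Config UI.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum


class CollectionMode(Enum):
    # Labels mirror the three modes listed above; the enum itself is hypothetical.
    DEFAULT = "Collect Data (Default)"
    FULL_REFRESH = "Collect Data in Full Refresh Mode"
    RETRANSFORM = "Re-transform Data"


@dataclass
class SyncPolicy:
    """Hypothetical container for the three sync-policy settings."""
    time_after: datetime            # Time Filter: start of the range to sync
    cron: str                       # Frequency: assumed five-field cron expression
    skip_failed_tasks: bool = True  # Running Policy: continue past failed tasks

    def validate(self) -> None:
        # Only a shape check; this does not verify that each field is a valid value.
        fields = self.cron.split()
        if len(fields) != 5:
            raise ValueError(f"expected 5 cron fields, got {len(fields)}: {self.cron!r}")
        if self.time_after > datetime.now(timezone.utc):
            raise ValueError("time filter must not start in the future")


def pick_mode(rules_changed: bool, need_clean_recollect: bool) -> CollectionMode:
    """Choose a mode following the descriptions in the list above."""
    if need_clean_recollect:
        # Delete and re-collect everything within the time range.
        return CollectionMode.FULL_REFRESH
    if rules_changed:
        # Apply new transformation rules without collecting new data.
        return CollectionMode.RETRANSFORM
    # Recurring pipelines: incremental refresh where supported.
    return CollectionMode.DEFAULT


# Example: sync the last 180 days, every day at midnight, skipping failed tasks.
policy = SyncPolicy(
    time_after=datetime.now(timezone.utc) - timedelta(days=180),
    cron="0 0 * * *",
)
policy.validate()
mode = pick_mode(rules_changed=False, need_clean_recollect=False)
```

For a large data scope, the combination shown here (a bounded time range, a daily schedule, skipping failed tasks, and the default incremental mode) keeps each run small; full refresh is the expensive option and is best reserved for cases where the existing data must be rebuilt.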
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@devlake.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
