dosubot[bot] commented on issue #8818:
URL: 
https://github.com/apache/incubator-devlake/issues/8818#issuecomment-4153152526

   This is a great question — here's what's happening and what you can do about 
it.
   
   **Why normal collection misses commits:** In incremental mode, the 
`gitextractor` plugin performs a **shallow git clone** 
(`git fetch --shallow-since=<timeAfter>`) instead of a full clone 
[[1]](https://github.com/apache/incubator-devlake/blob/2c15ad880dce830c6512fa58039527cbf4745b8b/backend/plugins/gitextractor/parser/clone_gitcli.go).
 Commits whose **parent commits fall outside the `timeAfter` window** are 
intentionally skipped to avoid writing incorrect diff statistics 
[[2]](https://github.com/apache/incubator-devlake/blob/2c15ad880dce830c6512fa58039527cbf4745b8b/backend/plugins/gitextractor/parser/repo_gogit.go).
 This means boundary commits get dropped from the `repo_commits` and 
`commit_parents` tables entirely. This behavior was reported in [Issue 
#8434](https://github.com/apache/incubator-devlake/issues/8434) 
[[3]](https://github.com/apache/incubator-devlake/issues/8434#issuecomment-2863461586).
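
The boundary effect described above can be illustrated with a toy model. This is only a sketch and not DevLake's code: the history, the function names `shallow_fetch` and `storable_commits`, and the rule "drop a commit if any parent is missing" are simplifying assumptions made for illustration.

```python
from datetime import datetime, timezone

# Toy history: sha -> (commit time, parent shas). All values are made up.
HISTORY = {
    "a1": (datetime(2024, 1, 1, tzinfo=timezone.utc), []),
    "b2": (datetime(2024, 2, 1, tzinfo=timezone.utc), ["a1"]),
    "c3": (datetime(2024, 3, 1, tzinfo=timezone.utc), ["b2"]),
    "d4": (datetime(2024, 4, 1, tzinfo=timezone.utc), ["c3"]),
}


def shallow_fetch(history, time_after):
    """Keep only commits at or after time_after (a stand-in for --shallow-since)."""
    return {sha: meta for sha, meta in history.items() if meta[0] >= time_after}


def storable_commits(fetched):
    """Return the commits whose parents are all present in the fetched set."""
    return sorted(
        sha
        for sha, (_, parents) in fetched.items()
        if all(parent in fetched for parent in parents)
    )


time_after = datetime(2024, 2, 15, tzinfo=timezone.utc)
fetched = shallow_fetch(HISTORY, time_after)
kept = storable_commits(fetched)
print("fetched:", sorted(fetched))  # c3 and d4 are inside the window
print("kept:", kept)                # c3 is dropped: its parent b2 is outside
```

In this model `c3` lies inside the window but is skipped because its parent `b2` does not, while running `storable_commits` on the full history keeps every commit.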
   
   **Why Full Refresh Mode works:** When `FullSync = true`, DevLake performs a 
complete `git clone --bare` with no shallow cutoff or time boundary, so all 
commits and their parents are available 
[[4]](https://github.com/apache/incubator-devlake/blob/2c15ad880dce830c6512fa58039527cbf4745b8b/backend/helpers/pluginhelper/api/subtask_state_manager.go).
   
   **Configuring scheduled tasks for Full Refresh:** Unfortunately, **there is 
currently no built-in way to make a cron-scheduled blueprint always run in Full 
Refresh Mode**. The `FullSync` flag is a trigger-time override only, not a 
persistent blueprint setting 
[[5]](https://github.com/apache/incubator-devlake/blob/2c15ad880dce830c6512fa58039527cbf4745b8b/backend/core/models/blueprint.go).
 Your options are:
   
   1. **Trigger via API each time** with Full Refresh:
   
      ```
      POST /api/blueprints/{blueprintId}/trigger
      {"skipCollectors": false, "fullSync": true}
      ```
   
      You could automate this with an external cron job calling this API 
endpoint 
[[6]](https://github.com/apache/incubator-devlake/blob/2c15ad880dce830c6512fa58039527cbf4745b8b/backend/server/services/blueprint.go).
   
   2. **Extend `timeAfter` generously** — set it far enough in the past that 
parent commits of all relevant commits fall within the window. The maintainer 
suggested *"increasing the time range"* as a workaround 
[[3]](https://github.com/apache/incubator-devlake/issues/8434#issuecomment-2863461586).
   
   3. **Periodically update `timeAfter`** via the UI to keep the collection 
window appropriate for your needs 
[[7]](https://github.com/apache/incubator-devlake/issues/8467#issuecomment-2976273817).
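
Option 1 above could be automated with a small script run from an external scheduler. The following is a minimal sketch using only the Python standard library. The endpoint path and JSON body come from the request shown in option 1; the base URL, the blueprint ID, the optional bearer-token header, and the function names are assumptions that need adapting to your deployment.

```python
import json
import urllib.request


def build_trigger_request(base_url, blueprint_id, token=None):
    """Build the POST request that triggers a blueprint with fullSync enabled."""
    url = f"{base_url.rstrip('/')}/api/blueprints/{blueprint_id}/trigger"
    body = json.dumps({"skipCollectors": False, "fullSync": True}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: the deployment expects a bearer token; adjust or remove.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def trigger_full_sync(base_url, blueprint_id, token=None, timeout=30):
    """Send the trigger request and return the HTTP status code."""
    request = build_trigger_request(base_url, blueprint_id, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Example call (hypothetical host and blueprint ID):
# trigger_full_sync("http://localhost:8080", 1)
```

Splitting request construction from sending keeps the payload easy to inspect before pointing the script at a live instance. If you schedule it this way, you would likely want to disable or relax the blueprint's own cron schedule so the two do not overlap.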
   
   Note: `timeAfter` is stored as a **fixed timestamp** from when the blueprint 
was configured; it does not automatically slide forward on cron runs 
[[7]](https://github.com/apache/incubator-devlake/issues/8467#issuecomment-2976273817).
 Also, avoid setting `timeAfter` to `null` via the PATCH API, as there's an 
open bug ([#8516](https://github.com/apache/incubator-devlake/issues/8516)) 
where it gets stored as `0001-01-01` instead 
[[8]](https://github.com/apache/incubator-devlake/issues/8516).
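
Because `timeAfter` does not slide on its own, anyone following option 3 has to work out a new value each time. A minimal sketch of that calculation follows; the lookback length, the function name `sliding_time_after`, and the UTC `YYYY-MM-DDTHH:MM:SSZ` output format are assumptions, so check what your DevLake version accepts before using the value.

```python
from datetime import datetime, timedelta, timezone


def sliding_time_after(now, lookback_days):
    """Return midnight UTC, lookback_days before `now`, as a timestamp string."""
    start = now.astimezone(timezone.utc) - timedelta(days=lookback_days)
    start = start.replace(hour=0, minute=0, second=0, microsecond=0)
    return start.strftime("%Y-%m-%dT%H:%M:%SZ")


# Example: a 180-day window computed from the current time.
print(sliding_time_after(datetime.now(timezone.utc), 180))
```

A longer lookback makes it less likely that a relevant commit's parent falls outside the window, at the cost of a larger fetch.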
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.