dosubot[bot] commented on issue #7524:
URL: 
https://github.com/apache/incubator-devlake/issues/7524#issuecomment-2133601536

   <!-- Greeting -->
   Hello @claudious05! I'm here to help you with your issue. Feel free to ask 
me anything.
   
   <!-- Answer -->
   To resolve the "plain clone git error" in the gitextractor job when 
collecting data from an on-premise GitLab using DevLake (v0.21.0@ac4f7fe) with 
Docker Compose, you can follow these steps:
   
   1. **Check the Proxy Configuration**: Ensure that the proxy settings are 
correctly configured if you are using a proxy. The code supports both 
HTTP/HTTPS and SSH proxies.
   
   2. **Verify the Private Key**: If you are using SSH with a private key, make 
sure the key is the one authorized on the GitLab server and that its passphrase 
(if any) is set correctly.
   
   3. **Disable Shallow Clone**: If the error persists, try disabling the 
shallow clone by setting `NoShallowClone` to `true` in the `CloneRepoConfig`.
   
   Here is an example of how you can modify the `CloneRepoConfig` to disable 
shallow cloning:
   
   ```go
   type CloneRepoConfig struct {
       UseGoGit        *bool
       SkipCommitStat  *bool
       SkipCommitFiles *bool
       NoShallowClone  bool
   }
   
   // Example usage
   config := CloneRepoConfig{
       NoShallowClone: true,
   }
   ```
   
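   4. **Check SSL Verification**: If you are using HTTPS and the clone fails on 
certificate verification (for example, a self-signed certificate on the 
on-premise GitLab), you can disable SSL verification by setting the 
`IN_SECURE_SKIP_VERIFY` configuration to `true`. This turns off certificate 
checks for the clone, so use it only on a network you trust.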
   
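   5. **Repack and Deepen Commits**: When `since` is set, the cloner runs 
`git repack -d` followed by `git fetch --deepen=1` on the cloned repo to avoid 
certain shallow-clone errors. If either command fails, the error is reported as 
`failed to repack the repo` or `failed to deepen the cloned repo`, so check the 
task log for those messages.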
   
   Here is the relevant part of the code for repacking and deepening commits:
   
   ```go
   // deepen the commits by 1 more step to avoid https://github.com/apache/incubator-devlake/issues/7426
   if since != nil {
       // fixes error described on https://stackoverflow.com/questions/63878612/git-fatal-error-in-object-unshallow-sha-1
       // It might be caused by the commit which being deepen has multiple parents (e.g., a merge commit), not sure.
       repackCmd := exec.CommandContext(ctx.GetContext(), "git", "-C", localDir, "repack", "-d")
       if err := repackCmd.Run(); err != nil {
           return errors.Default.Wrap(err, "failed to repack the repo")
       }
       deepenCmd := exec.CommandContext(ctx.GetContext(), "git", "-C", localDir, "fetch", "--deepen=1")
       if err := deepenCmd.Run(); err != nil {
           return errors.Default.Wrap(err, "failed to deepen the cloned repo")
       }
   }
   ```
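
   The repack and deepen step only matters for a shallow clone. As a rough way 
to check whether a clone left on disk is shallow, the sketch below looks for the 
`shallow` file that Git keeps inside the git directory of a shallow repository. 
The helper name `isShallow` and the choice to check both a bare and a non-bare 
layout are assumptions made for this example; this is not DevLake code.

   ```go
   package main

   import (
       "fmt"
       "os"
       "path/filepath"
   )

   // isShallow is a hypothetical helper: it reports whether repoDir looks
   // like a shallow clone. Git records the shallow boundary commits in a
   // file named "shallow" inside the git directory, so the file is looked
   // up for both a bare repo (repoDir itself) and a non-bare one (.git).
   func isShallow(repoDir string) bool {
       candidates := []string{
           filepath.Join(repoDir, "shallow"),
           filepath.Join(repoDir, ".git", "shallow"),
       }
       for _, p := range candidates {
           if info, err := os.Stat(p); err == nil && !info.IsDir() {
               return true
           }
       }
       return false
   }

   func main() {
       dir := "."
       if len(os.Args) > 1 {
           dir = os.Args[1]
       }
       fmt.Printf("%s shallow: %v\n", dir, isShallow(dir))
   }
   ```

   If this reports `false` for a clone made with `NoShallowClone` set to 
`true`, the shallow-clone code path is probably not the source of the error.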
   
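   6. **Ensure Proper Cleanup**: The `CloneGitRepo` function clones into a 
temporary directory created with `os.MkdirTemp` and registers a cleanup 
callback that removes it afterwards. In the code shown here the callback is 
registered only after a successful clone, so a failed clone can leave a 
`gitextractor*` directory behind. Remove any leftovers and make sure the temp 
directory has enough free space before retrying.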
   
   Here is the relevant part of the `CloneGitRepo` function:
   
   ```go
   func CloneGitRepo(subTaskCtx plugin.SubTaskContext) errors.Error {
       taskData, ok := subTaskCtx.GetData().(*parser.GitExtractorTaskData)
       if !ok {
           panic("git repo reference not found on context")
       }
       op := taskData.Options
       storage := store.NewDatabase(subTaskCtx, op.RepoId)
       var err errors.Error
       logger := subTaskCtx.GetLogger()
   
       // temporary dir for cloning
       localDir, e := os.MkdirTemp("", "gitextractor")
       if e != nil {
           return errors.Convert(e)
       }
   
       // clone repo
       repoCloner := parser.NewGitcliCloner(subTaskCtx)
       err = repoCloner.CloneRepo(subTaskCtx, localDir)
       if err != nil {
           if errors.Is(err, parser.ErrNoData) {
               taskData.SkipAllSubtasks = true
               return nil
           }
           return err
       }
   
       // We have done comparison experiments for git2go and go-git, and the results show that git2go has better performance.
       var repoCollector parser.RepoCollector
       if *taskData.Options.UseGoGit {
           repoCollector, err = parser.NewGogitRepoCollector(localDir, op.RepoId, storage, logger)
       } else {
           repoCollector, err = parser.NewLibgit2RepoCollector(localDir, op.RepoId, storage, logger)
       }
       if err != nil {
           return err
       }
   
       // inject clean up callback to remove the cloned dir
       cleanup := func() {
           _ = os.RemoveAll(localDir)
       }
       if e := repoCollector.SetCleanUp(cleanup); e != nil {
           return errors.Convert(e)
       }
   
       // pass the collector down to next subtask
       taskData.GitRepo = repoCollector
       subTaskCtx.TaskContext().SetData(taskData)
       return nil
   }
   ```
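
   As an illustration of cleanup that also covers the failure path, here is a 
minimal sketch that removes the temporary directory whenever the work function 
returns an error. The helper `withTempDir` is hypothetical and is not part of 
DevLake; it only shows the `defer` pattern under that assumption.

   ```go
   package main

   import (
       "errors"
       "fmt"
       "os"
   )

   // withTempDir is a hypothetical helper: it creates a temp directory,
   // runs fn inside it, and deletes the directory if fn fails. On success
   // the directory is kept so the caller can hand it to a later step.
   func withTempDir(prefix string, fn func(dir string) error) (dir string, err error) {
       dir, err = os.MkdirTemp("", prefix)
       if err != nil {
           return "", err
       }
       defer func() {
           if err != nil {
               _ = os.RemoveAll(dir) // do not leave a failed clone behind
           }
       }()
       err = fn(dir)
       return dir, err
   }

   func main() {
       dir, err := withTempDir("gitextractor", func(string) error {
           return errors.New("simulated clone failure")
       })
       _, statErr := os.Stat(dir)
       fmt.Println("clone error:", err)
       fmt.Println("temp dir removed:", os.IsNotExist(statErr))
   }
   ```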
   
   By following these steps, you should be able to resolve the "plain clone git 
error" in the gitextractor job 
<sup>[[1]](https://github.com/apache/incubator-devlake/blob/main/backend/plugins/gitextractor/parser/clone_gitcli.go)</sup><sup>[[2]](https://github.com/apache/incubator-devlake/blob/main/backend/plugins/gitextractor/tasks/repo_cloner.go)</sup>.
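
   To narrow down which of the steps above applies, it can help to reproduce 
the connection outside DevLake. The sketch below runs `git ls-remote`, a 
read-only command that goes through the same network, TLS, and authentication 
path as a clone, once with certificate checks on and once with them off. The 
function `probeArgs` and its parameters are assumptions made for this example, 
and the `http.proxy` setting covers HTTP/HTTPS remotes only.

   ```go
   package main

   import (
       "context"
       "fmt"
       "os"
       "os/exec"
       "time"
   )

   // probeArgs is a hypothetical helper that builds the git arguments for
   // a read-only connectivity check against repoURL.
   func probeArgs(repoURL, proxy string, skipTLSVerify bool) []string {
       var args []string
       if proxy != "" {
           args = append(args, "-c", "http.proxy="+proxy)
       }
       if skipTLSVerify {
           args = append(args, "-c", "http.sslVerify=false")
       }
       return append(args, "ls-remote", "--heads", repoURL)
   }

   func main() {
       if len(os.Args) < 2 {
           fmt.Println("usage: probe <repo-url> [http-proxy]")
           return
       }
       proxy := ""
       if len(os.Args) > 2 {
           proxy = os.Args[2]
       }
       ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
       defer cancel()
       // Try with certificate checks first, then without them.
       for _, skip := range []bool{false, true} {
           cmd := exec.CommandContext(ctx, "git", probeArgs(os.Args[1], proxy, skip)...)
           out, err := cmd.CombinedOutput()
           fmt.Printf("sslVerify disabled=%v err=%v\n%s\n", skip, err, out)
       }
   }
   ```

   If the first attempt fails and the second succeeds, the certificate is the 
likely cause (step 4). If both fail, look at the proxy or the credentials 
first (steps 1 and 2). Run it inside the DevLake container so that it sees the 
same network as the gitextractor job.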
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
