dealexce opened a new issue, #2650:
URL: https://github.com/apache/incubator-devlake/issues/2650

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/incubator-devlake/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### What happened
   
   Hello everyone. I am using the latest main branch and trying to 
collect GitLab commit data with DevLake. However, I found that some 
commits are sometimes not collected into the database after the pipeline run. 
I have tested on 3 projects, with the following outcomes:
   
   1. all commits on 28 branches/tags are stored successfully
   2. only the 7th most recent commit (sha:7d7600..) on the dev branch is 
missing. The commits on the master branch are all stored
   3. same as the second project: only the 7th most recent commit on the dev 
branch is missing
   
   I'm not sure whether this is related to the commit being the 7th most 
recent, to the branch being named 'dev', or to something else. Here is what I 
have tried and observed so far:
   
   1. The first project also has a branch named 'dev', and all of its commits 
seem to be collected correctly
   2. I pushed a new commit to the second project. The 7d7600 commit (now the 
8th most recent) is still missing, and all other commits are collected.
   
   This is how I query the commits on a specific branch:
   `with res as (
   SELECT
     @r AS _sha,
     (SELECT @r := parent_commit_sha FROM commit_parents WHERE commit_sha = _sha LIMIT 1) AS parent_sha,
     @l := @l + 1 AS lvl
   FROM (SELECT @r := (SELECT commit_sha FROM refs WHERE id = 'gitlab:GitlabProject:1:175:origin/dev'), @l := 0) vars,
   commit_parents h)
   SELECT _sha, parent_sha FROM res WHERE parent_sha IS NOT NULL;`
   
   This is what I got:
   
![image](https://user-images.githubusercontent.com/50829041/182084026-e5665cdc-bc38-4cce-9122-b466af20cef5.png)
   
   It shows only 7 commits on this branch. The commit `7d7600..` has no row 
in the `commit_parents` table, so the chain ends there. However, 7d7600 is 
also not found in the `commits` table,
   
![image](https://user-images.githubusercontent.com/50829041/182084773-a4817f08-1e39-4bb9-ae83-edf89fc5679b.png)
   
   and GitLab shows that there are more commits before 7d7600:
   
![image](https://user-images.githubusercontent.com/50829041/182084360-d175b32d-8804-4809-9606-7b8867fb2c1a.png)
   
   Moreover, I looked up the SHA of the parent commit of 7d7600 in GitLab 
(acc33d..) and found that the commits that come after 7d7600 in the log, 
starting from acc33d.., are all stored correctly in the tables:
   
![image](https://user-images.githubusercontent.com/50829041/182085493-69b4734a-421e-4685-9248-cc037c62eaae.png)
   ...
   
![image](https://user-images.githubusercontent.com/50829041/182085511-1cf78e4e-59c7-4ad7-bebd-81241c5473e7.png)
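
   For reference, the check I did by hand above can be sketched as a small 
script. This is only an illustration under assumptions: it uses an in-memory 
SQLite database with toy SHAs (`c1`..`c5`) instead of the real MySQL database, 
it assumes the `commits` table is keyed by a `sha` column, and the function 
names are hypothetical. The `commit_parents` columns are the ones used in my 
query.

```python
import sqlite3


def build_demo_db():
    """Create a toy database: history c1 -> c2 -> c3 -> c4 -> c5, with c3 never stored."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE commits (sha TEXT PRIMARY KEY);  -- 'sha' column is an assumption
        CREATE TABLE commit_parents (commit_sha TEXT, parent_commit_sha TEXT);
        """
    )
    # c3 is missing from commits, so its own parent row (c3 -> c4) is missing too.
    conn.executemany("INSERT INTO commits VALUES (?)", [("c1",), ("c2",), ("c4",), ("c5",)])
    conn.executemany(
        "INSERT INTO commit_parents VALUES (?, ?)",
        [("c1", "c2"), ("c2", "c3"), ("c4", "c5")],
    )
    return conn


def find_dangling_parents(conn):
    """Return parent SHAs referenced in commit_parents that have no row in commits."""
    rows = conn.execute(
        """
        SELECT DISTINCT cp.parent_commit_sha
        FROM commit_parents cp
        LEFT JOIN commits c ON c.sha = cp.parent_commit_sha
        WHERE c.sha IS NULL
        ORDER BY cp.parent_commit_sha
        """
    ).fetchall()
    return [r[0] for r in rows]


def walk_parents(conn, start_sha, limit=1000):
    """Follow one parent per commit from start_sha and return the SHAs visited."""
    chain = [start_sha]
    sha = start_sha
    for _ in range(limit):
        row = conn.execute(
            "SELECT parent_commit_sha FROM commit_parents WHERE commit_sha = ? LIMIT 1",
            (sha,),
        ).fetchone()
        if row is None:  # no parent row: either a root commit or a missing commit
            break
        sha = row[0]
        chain.append(sha)
    return chain


if __name__ == "__main__":
    conn = build_demo_db()
    print("dangling parents:", find_dangling_parents(conn))
    print("chain from c1:", walk_parents(conn, "c1"))
```

   In the toy data the walk stops at `c3`, which is referenced as a parent but 
has no row in `commits`, while the older commits `c4` and `c5` are still 
present. That is the same pattern I see with 7d7600 on the dev branch.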
   
   So I think this commit was probably dropped somehow during collection. 
Here are the logs of the pipeline run for the second project: 
   [[task #18] 
[gitextractor].log](https://github.com/apache/incubator-devlake/files/9231509/task.18.gitextractor.log)
   
   I might have done something wrong here. Thanks a lot for any ideas.
   
   ### What you expected to happen
   
   All commits on all branches of the project should be collected and stored.
   
   ### How to reproduce
   
   Use the latest DevLake, create a GitLab blueprint, and run it. 
   
   ### Anything else
   
   _No response_
   
   ### Version
   
   main
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   

