qpawelc opened a new issue, #8249:
URL: https://github.com/apache/incubator-devlake/issues/8249

   ## Question
   
   Hey all!
   
   I am using DevLake `v1.0.1`. I ingest the commits for a project through the 
GitLab integration, and I ingest the deployments for the same project via 
webhook (from Spinnaker). I do not have all-time deployment data for the 
project, only the last couple of months. After running the collect data job, I 
noticed that in the `project_pr_metrics` table, all of the commits that 
occurred before I onboarded to Spinnaker get associated with a seemingly 
random deployment.
   
   It may be easier to visualize this. Take a look at this screenshot. The 
bottom blue dots represent commits. The top dots represent deployments. The 
lines represent the commit-to-deployment relationship in the 
`project_pr_metrics` table. As you can see, all of the commits that occurred 
before the start of my deployment data become associated with a single 
deployment, which skews certain metrics heavily.
   
   Does anyone have any advice on how I can:
   
   - Debug why DevLake chooses to associate these commits with this deployment? 
For example, why wouldn't it be the first deployment?
   - Work around this issue? One idea I have is to set the blueprint for each 
project only to ingest data after the second successful production deployment.
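
   A minimal sketch of both ideas, using made-up data rather than real DevLake 
output. It assumes the commit-to-deployment pairs have been exported from 
`project_pr_metrics` as `(pr_id, deployment_id)` tuples and that deployment 
finish times are available from the webhook data; the field names, IDs, and 
dates below are hypothetical and do not reflect the confirmed schema.

```python
from collections import Counter
from datetime import datetime

# Hypothetical export of (pr_id, deployment_id) pairs; names are assumptions.
rows = [
    ("pr-1", "deploy-2"),
    ("pr-2", "deploy-2"),
    ("pr-3", "deploy-2"),
    ("pr-4", "deploy-3"),
    ("pr-5", None),  # not yet associated with any deployment
]

# Hypothetical deployment finish times taken from the webhook data.
deployments = {
    "deploy-1": datetime(2024, 9, 1),
    "deploy-2": datetime(2024, 9, 8),
    "deploy-3": datetime(2024, 9, 15),
}


def prs_per_deployment(rows):
    """Count how many PR rows point at each deployment, skipping unlinked rows."""
    return Counter(dep_id for _pr_id, dep_id in rows if dep_id is not None)


def second_deployment_cutoff(deployments):
    """Return the finish time of the second-earliest deployment, or None."""
    ordered = sorted(deployments.values())
    return ordered[1] if len(ordered) >= 2 else None


counts = prs_per_deployment(rows)
# The deployment with the most associated PRs is the one to inspect first.
suspect_id, suspect_count = counts.most_common(1)[0]
# Candidate start date for the blueprint's data time range.
cutoff = second_deployment_cutoff(deployments)
print(suspect_id, suspect_count, cutoff)
```

   The count only shows which deployment absorbs the backlog of commits; it does 
not explain why that deployment was chosen over the first one.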
   
   Thanks for your time and support!
   
   ## Screenshots
   
![image](https://github.com/user-attachments/assets/b6b0eb2c-007d-40e4-afa9-20cf471b9ee7)
   
   ## Additional context
   
   It's worth noting that I have found a similar 
[issue](https://github.com/apache/incubator-devlake/issues/7193). In this 
[comment](https://github.com/apache/incubator-devlake/issues/7193#issuecomment-2085140726)
 @nicolavolpini mentioned that "still shows several PRs associated to the same 
deployment webhook" but no additional context was posted on how this issue 
could be resolved.

