ashokobserve opened a new issue, #8967:
URL: https://github.com/apache/devlake/issues/8967

   ## Summary
   
   DevLake ingests each Jira issue's **current** state (status, 
`resolution_date`, story points) plus whatever changelog it can collect, but it 
does **not** persist Jira's **Sprint Report snapshot** — the frozen record Jira 
takes at sprint close. As a result, per-sprint **committed** and **completed** 
velocity cannot be reproduced accurately from DevLake's domain tables, even 
though the numbers are correct in Jira's own Sprint/Velocity report.
   
   I'd like to propose collecting and persisting the Greenhopper Sprint Report 
so velocity (committed + completed) is exact and matches Jira's board.
   
   ## Why this matters
   
   - **Velocity is a first-class Scrum metric.** Teams use DevLake dashboards 
to track committed vs completed story points per sprint. Today those numbers 
can only be *approximated*.
   - **Commitment is not reconstructable at all** from current-state + 
changelogs. Commitment is the set of issues (and their SP) *as they stood when 
the sprint started* — a frozen snapshot DevLake never records.
   - **Completed story points drift between sprints because of carryover.** 
Jira attributes an issue's completion to the sprint in which it was Done **at 
that sprint's close**. Reconstructing this from a single `resolution_date` 
mis-attributes carryover issues (issues that appear in ≥2 sprints), producing 
equal-and-opposite errors in adjacent sprints.
   
   ## Evidence
   
   We validated a best-effort reconstruction (board-scoped `sprint_issues ∩ 
board_issues`, subtasks excluded, SP from the story-point custom field, 
completed = became-Done within the sprint window) against Jira's Sprint Report 
across 40 actively-used Scrum boards (208 closed sprints):
   
   - **Aggregate accuracy: ~99.7%** (errors cancel across sprints).
   - **Per-sprint accuracy varies sharply with carryover rate:**
     - Low-carryover boards (~0–4% of issues in multiple sprints): **100%** of 
sprints within 5% of Jira's Sprint Report value.
     - High-carryover boards (~25–29%): as few as **2 of 11** sprints within 5%.
   - Root cause confirmed at the issue level: the same handful of carryover 
issues over-count sprint N and under-count sprint N+1, because their single 
`resolution_date` lands in the wrong sprint's window. Jira's Sprint Report 
places them correctly because it is a frozen per-sprint snapshot.
   
   ## How to reproduce the gap
   
   1. Pick a board with meaningful carryover (issues that span >1 sprint).
   2. For two adjacent closed sprints, compute completed SP from DevLake domain 
tables using `resolution_date` within the sprint window.
   3. Compare to Jira's Sprint Report `completedIssuesEstimateSum` for the same 
sprints.
   4. Observe: aggregate over both sprints matches closely, but per-sprint 
values are off by equal-and-opposite amounts for carryover issues.
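
The steps above can be illustrated with a small toy model. This is only a sketch: the sprint windows, issue keys, story points, and the sprint Jira credits for each issue are hypothetical values chosen to show the effect, not data from the validation run. `A-3` stands in for a carryover issue whose `resolution_date` falls inside sprint N's window while Jira's Sprint Report credits it to sprint N+1.

```python
from datetime import date

# Hypothetical sprint windows (start, end), inclusive.
SPRINT_WINDOWS = {
    "N": (date(2024, 1, 1), date(2024, 1, 14)),
    "N+1": (date(2024, 1, 15), date(2024, 1, 28)),
}

# Hypothetical issues:
# (key, story points, resolution_date, sprint credited by Jira's Sprint Report)
ISSUES = [
    ("A-1", 5, date(2024, 1, 10), "N"),
    ("A-2", 3, date(2024, 1, 20), "N+1"),
    ("A-3", 8, date(2024, 1, 14), "N+1"),  # carryover issue
]


def reconstructed_completed(issues, windows):
    """Completed SP per sprint, using resolution_date within the sprint window."""
    totals = {name: 0 for name in windows}
    for _key, points, resolved, _credited in issues:
        for name, (start, end) in windows.items():
            if start <= resolved <= end:
                totals[name] += points
    return totals


def reported_completed(issues, windows):
    """Completed SP per sprint, using the sprint the frozen report credits."""
    totals = {name: 0 for name in windows}
    for _key, points, _resolved, credited in issues:
        totals[credited] += points
    return totals


recon = reconstructed_completed(ISSUES, SPRINT_WINDOWS)
report = reported_completed(ISSUES, SPRINT_WINDOWS)
delta = {name: recon[name] - report[name] for name in SPRINT_WINDOWS}

print("reconstructed:", recon)
print("sprint report:", report)
print("per-sprint delta:", delta)
```

In this toy case the two-sprint totals agree, while the per-sprint deltas are equal in size and opposite in sign, which is the pattern described in step 4.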
   
   ## Proposed solution
   
   Add a collector to the Jira plugin that captures the frozen Sprint Report 
per (board, sprint). The report is frozen at sprint close and can be retrieved 
retroactively for any closed sprint, so both **backfill** and **incremental** 
collection are possible:
   
   ```
   GET /rest/greenhopper/1.0/rapid/charts/sprintreport?rapidViewId={boardId}&sprintId={sprintId}
   ```
   
   Response buckets to persist:
   - `completedIssues` (+ `completedIssuesEstimateSum`)
   - `issuesNotCompletedInCurrentSprint` (+ estimate sum)
   - `puntedIssues` (removed mid-sprint)
   - `issuesCompletedInAnotherSprint`
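
As a rough sketch of what the extraction step might look like, the snippet below builds the request URL and flattens the four buckets into one row per (issue, bucket). The bucket names and the endpoint path come from this issue; everything else is an assumption made for illustration: the helper names, the `contents` wrapper, and the `estimateStatistic.statFieldValue.value` path for the frozen story-point value should be checked against a real response before relying on them. No HTTP call is made here.

```python
from urllib.parse import urlencode

SPRINT_REPORT_PATH = "/rest/greenhopper/1.0/rapid/charts/sprintreport"

# Bucket names as listed in this proposal.
BUCKETS = (
    "completedIssues",
    "issuesNotCompletedInCurrentSprint",
    "puntedIssues",
    "issuesCompletedInAnotherSprint",
)


def sprint_report_url(base_url, board_id, sprint_id):
    """Build the Sprint Report URL for one (board, sprint) pair."""
    query = urlencode({"rapidViewId": board_id, "sprintId": sprint_id})
    return f"{base_url.rstrip('/')}{SPRINT_REPORT_PATH}?{query}"


def _estimate(issue):
    # ASSUMPTION: the frozen estimate lives at
    # estimateStatistic.statFieldValue.value; returns None when absent.
    stat = issue.get("estimateStatistic") or {}
    return (stat.get("statFieldValue") or {}).get("value")


def flatten_buckets(report):
    """Turn a decoded Sprint Report payload into one row per (issue, bucket)."""
    # ASSUMPTION: buckets sit under a top-level "contents" object;
    # fall back to the payload itself if that key is missing.
    contents = report.get("contents", report)
    rows = []
    for bucket in BUCKETS:
        for issue in contents.get(bucket, []):
            rows.append({
                "issue_id": issue.get("id"),
                "issue_key": issue.get("key"),
                "bucket": bucket,
                "story_points": _estimate(issue),
            })
    return rows


# Hand-written sample payload (hypothetical, not a captured response).
sample = {
    "contents": {
        "completedIssues": [
            {"id": 1000, "key": "PROJ-1",
             "estimateStatistic": {"statFieldValue": {"value": 5.0}}},
        ],
        "puntedIssues": [
            {"id": 1001, "key": "PROJ-2",
             "estimateStatistic": {"statFieldValue": {}}},
        ],
    }
}

url = sprint_report_url("https://jira.example.com/", 12, 345)
rows = flatten_buckets(sample)
```

Unestimated issues yield `story_points = None` rather than `0`, so a later aggregation can decide how to treat them.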
   
   Suggested modeling:
   - New tool-layer table `_tool_jira_sprint_reports` keyed by `(connection_id, 
board_id, sprint_id, issue_id)` with a `bucket` enum and the frozen SP value.
   - Domain enrichment: `sprint_issues.is_removed` / a per-(sprint,issue) 
`resolution_bucket`, plus sprint-level `committed_story_points` and 
`completed_story_points`.
   - This makes the standard velocity dashboard exact and removes the need for 
per-sprint reconstruction heuristics.
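
To make the modeling suggestion concrete, here is a minimal in-memory sketch of the proposed row shape and of deriving sprint-level `completed_story_points` from it. The class and function names are hypothetical, and the real tool-layer model would be defined in the plugin's own code. Deriving `committed_story_points` is deliberately left out because it depends on the sprint-start snapshot, which this sketch does not model.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Bucket(Enum):
    """The four Sprint Report buckets named in this proposal."""
    COMPLETED = "completedIssues"
    NOT_COMPLETED = "issuesNotCompletedInCurrentSprint"
    PUNTED = "puntedIssues"
    COMPLETED_IN_ANOTHER_SPRINT = "issuesCompletedInAnotherSprint"


@dataclass(frozen=True)
class SprintReportRow:
    """One row of the suggested _tool_jira_sprint_reports table (sketch)."""
    connection_id: int
    board_id: int
    sprint_id: int
    issue_id: int
    bucket: Bucket
    story_points: Optional[float]  # frozen SP value; None if unestimated

    @property
    def key(self):
        # Composite key suggested in the proposal.
        return (self.connection_id, self.board_id, self.sprint_id, self.issue_id)


def completed_story_points(rows):
    """Sum frozen SP of COMPLETED rows per (connection, board, sprint)."""
    totals = defaultdict(float)
    for row in rows:
        if row.bucket is Bucket.COMPLETED:
            sprint_key = (row.connection_id, row.board_id, row.sprint_id)
            totals[sprint_key] += row.story_points or 0.0
    return dict(totals)


# Hypothetical rows: issue 1001 carries over from sprint 7 to sprint 8.
example_rows = [
    SprintReportRow(1, 12, 7, 1000, Bucket.COMPLETED, 5.0),
    SprintReportRow(1, 12, 7, 1001, Bucket.NOT_COMPLETED, 8.0),
    SprintReportRow(1, 12, 8, 1001, Bucket.COMPLETED, 8.0),
]

totals = completed_story_points(example_rows)
```

Because the carryover issue has one row per sprint, each with its own bucket, its points are credited only to the sprint in which the frozen report marks it completed.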
   
   Notes:
   - The `/velocity` endpoint 
(`/rest/greenhopper/1.0/rapid/charts/velocity?rapidViewId={boardId}`) returns 
per-sprint committed/completed sums directly and is a cheaper option if 
issue-level buckets aren't needed.
   - These are non-public Greenhopper endpoints; the same auth as the existing 
Jira collector works.
   
   Happy to help with a PR if there's interest in this direction.
   

