keon94 commented on issue #1681:
URL: 
https://github.com/apache/incubator-devlake/issues/1681#issuecomment-1178402506

   @hezyin 
   You understood correctly, but for (3) we shouldn't need to extract the raw 
epics into _tool_jira_board (as @mindlesscloud also mentioned), because the 
regular issue extractor will already have extracted all the board-relevant 
epics to that table. This extractor should ideally deal only with epics that 
are not part of the board but are parents of issues that do belong to it. 
Maybe we need a better name for the subtasks, like 
[Foreign/Indirect/Referenced, etc.]EpicCollector/Extractor.
   
   As for your questions:
   
   1. If we simply grab all the epics by ID (the epic-key field in 
_tool_jira_issues), that will also return the epics that belong to the board. 
But those epics will already have been processed by the IssueCollector (and 
later the IssueExtractor). We can optimize this so that we only fetch the ones 
that are not from the board. We should be able to run a DB join query on 
_tool_jira_board_issues and _tool_jira_issues to get only the epic IDs of 
those issues whose epic issue IDs are not in the _tool_jira_board_issues table 
(issue_id key). This might be a complicated query, though.
   
   2. If we run the optimization in (1), we know we're dealing with epics that 
are not part of the board. My understanding from this request is that we won't 
be mapping these epics to the boards table. Separate pipelines involving these 
epics' boards will need to be run to get them on that table.
   
   3. The collector will be fetching the epics of many issues. Some issues may 
share the same epic, so the list of epic IDs that we get from the query might 
contain duplicates. We should be able to take care of this in the query itself, 
though.
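
   To make (1) and (3) concrete, here is a rough sketch of what such a query 
could look like, run against an in-memory SQLite database from Python. It is 
only an illustration under assumptions: the column names (issue_id, issue_key, 
epic_key, board_id) and the simplified two-table schema are hypothetical and 
may not match the real DevLake tables, and the function name 
foreign_epic_keys is made up for this example. The query selects the distinct 
epic keys of a board's issues whose epic is not itself on that board.

```python
import sqlite3

# Hypothetical, simplified schema. Column names are assumptions made for
# illustration and may differ from the real DevLake tables.
SCHEMA = """
CREATE TABLE _tool_jira_issues (
    issue_id  INTEGER PRIMARY KEY,
    issue_key TEXT NOT NULL,
    epic_key  TEXT NOT NULL DEFAULT ''
);
CREATE TABLE _tool_jira_board_issues (
    board_id INTEGER NOT NULL,
    issue_id INTEGER NOT NULL,
    PRIMARY KEY (board_id, issue_id)
);
"""

# For every issue on the board, look up its epic (if already collected) and
# check whether that epic is on the same board. Keep the epic keys for which
# no such board row exists. DISTINCT removes duplicates from shared epics.
FOREIGN_EPICS_QUERY = """
SELECT DISTINCT i.epic_key
FROM _tool_jira_board_issues bi
JOIN _tool_jira_issues i ON i.issue_id = bi.issue_id
LEFT JOIN _tool_jira_issues e ON e.issue_key = i.epic_key
LEFT JOIN _tool_jira_board_issues be
       ON be.issue_id = e.issue_id AND be.board_id = bi.board_id
WHERE bi.board_id = ?
  AND i.epic_key <> ''
  AND be.issue_id IS NULL
ORDER BY i.epic_key
"""


def foreign_epic_keys(conn, board_id):
    """Return epic keys referenced by the board's issues but not on the board."""
    rows = conn.execute(FOREIGN_EPICS_QUERY, (board_id,)).fetchall()
    return [row[0] for row in rows]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO _tool_jira_issues VALUES (?, ?, ?)",
        [
            (1, "A-1", ""),     # epic that is on board 10
            (2, "A-2", "A-1"),  # child of the on-board epic
            (3, "A-3", "B-7"),  # child of a foreign epic, not collected yet
            (4, "A-4", "B-7"),  # second child of the same foreign epic
            (5, "A-5", "C-9"),  # child of an epic collected for another board
            (9, "C-9", ""),     # that epic, present only on board 20
        ],
    )
    conn.executemany(
        "INSERT INTO _tool_jira_board_issues VALUES (?, ?)",
        [(10, 1), (10, 2), (10, 3), (10, 4), (10, 5), (20, 9)],
    )
    return foreign_epic_keys(conn, 10)


if __name__ == "__main__":
    print(demo())
```

   In this toy data set, A-1 is skipped because it is already on board 10, 
and B-7 appears once even though two issues reference it. The real query 
would likely also need to filter by connection, which this sketch leaves out.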


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
