potiuk commented on PR #37152:
URL: https://github.com/apache/airflow/pull/37152#issuecomment-1968525104

   This is nice. 
   
   I like the idea that we are focusing on coverage of particular test files - 
since we are doing selective tests, this is the only way to get somewhat close 
to full coverage. It's not ideal - sometimes, for example, we will get coverage 
in the API by running a different test - but it is actually a nice way to find 
tests that are put in the wrong folder, and we can figure that out later.
   
   There are however a few small problems I see (but I think we can figure out 
how to solve them):
   
   * Some files are now split between DB tests and non-DB tests - which means 
that we will only get the "full" coverage after we run both sets of tests. This 
is quite a common situation, to be honest. 
   
   This is solvable - by producing coverage files, uploading them as artifacts 
and then having a separate job which downloads all the coverage files, combines 
them and only then reports success/failure. Most of your code will still work - 
it could simply be a separate job run after everything else is done. We already 
do a very similar thing for "summarize warnings" - it is done for the very same 
reason. All the tests produce warning files and then we have a single job that 
pulls all of them, deduplicates them and summarizes them.
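
   The "separate combining job" idea could look roughly like this - a sketch 
only, where the job name, artifact name pattern and coverage file layout are my 
assumptions, not the actual Airflow CI configuration:

   ```yaml
   # Hypothetical workflow fragment: a single job that runs after all test
   # jobs, pulls every uploaded coverage artifact, merges them with
   # `coverage combine` and reports on the combined result.
   coverage-check:
     name: "Coverage Check"
     runs-on: ubuntu-22.04
     needs: [tests]  # run only after all the test jobs have finished
     steps:
       - name: "Download all coverage artifacts"
         uses: actions/download-artifact@v4
         with:
           pattern: coverage-*        # assumed artifact naming convention
           path: ./coverage-files
           merge-multiple: true
       - name: "Combine coverage files and report"
         run: |
           pip install coverage
           coverage combine ./coverage-files
           coverage report --show-missing
   ```

   The key point is `needs:` - the combine step only sees the full picture 
once every selective test job has uploaded its partial coverage file.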
   
   This also has a very nice effect - individual test jobs will not be failing. 
It is scary to see your tests failing - you need to look at the output to find 
out that it is the coverage check, rather than an actual test failure. Having a 
separate job is way better - because that job can be named "Coverage Check" or 
similar, and when it fails, you will immediately know that it's the coverage 
that failed.
   
   Plus we can make that single job non-fatal - i.e. a failure in coverage will 
not mean that the whole build fails (we do that for Quarantined tests) - at 
least initially, while we are figuring out all the quirks and problems, we can 
accept the job failing for a while until we find and solve all the 
imperfections we will see from people running actual PRs. Or initially - for 
testing - we could run it without failing at all, just producing a summary, and 
have a look at how it works for various PRs. Then gradually we could make it 
fail (but non-fatal) and eventually make it a "must" for merging a PR to have 
it green, once we are sure that there are very few false negatives.
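
   GitHub Actions supports the non-fatal behaviour directly via 
`continue-on-error` - a minimal illustrative fragment (job name and step are 
made up for the example):

   ```yaml
   # Hypothetical fragment: with `continue-on-error: true` at the job level,
   # a red "Coverage Check" does not fail the overall workflow run -
   # the same pattern we use for Quarantined tests.
   coverage-check:
     name: "Coverage Check"
     runs-on: ubuntu-22.04
     continue-on-error: true   # flip to false once false negatives are rare
     steps:
       - name: "Report combined coverage"
         run: coverage report --show-missing
   ```

   Flipping that single flag later is how we would graduate from "informative 
only" to "must be green to merge".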
   
   * The information that the coverage dropped, without an easy way to check 
it, is not very helpful for the PR author
   
   I think it would be good (and this is not a very complex thing) to have a 
way to show the diff of coverage for the files that have problems: before / 
after. I think that can be done nicely by generating the right URL in the 
remote Codecov - when we upload the coverage from the tests, there is a way to 
see it and, I think, even see the diff. Showing just "coverage dropped" without 
giving the user a clue how to check it is bad - showing a URL will help. Also 
there is the GitHub Action from Codecov which we could use. 
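
   Uploading via the official Codecov action is what makes those URLs 
available - a sketch, where the file path and flag name are assumptions for 
illustration:

   ```yaml
   # Hypothetical step using codecov/codecov-action to upload a coverage
   # report; Codecov then hosts the per-file and diff views we can link to.
   - name: "Upload coverage to Codecov"
     uses: codecov/codecov-action@v4
     with:
       files: ./coverage.xml     # assumed report location
       flags: db-tests           # assumed flag to separate DB/non-DB runs
       fail_ci_if_error: false   # keep the upload itself non-fatal
   ```

   Using flags per test type would also fit the DB / non-DB split mentioned 
above, since Codecov merges flagged uploads for the same commit.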
   
   I think such an approach might be way better than what the GitHub 
integration from Codecov provides by default - with their actions they post 
comments in the PR with a summary etc., but I find that far too spammy. Having 
a separate job with "code coverage" information, an indication of problems (by 
status) and a list of problematic files with a URL to follow is a way better 
thing. 
   
   * Full vs. expected coverage
   
   I think eventually, rather than checking for full coverage, we could turn 
it into a check for coverage drops, and this way we could easily increase the 
number of files we monitor. But that's another story.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
