potiuk commented on PR #37152: URL: https://github.com/apache/airflow/pull/37152#issuecomment-1968525104
This is nice. I like the idea that we are focusing on coverage of particular test files - because since we are doing selective tests, this is the only way to get somewhat close to full coverage. It's not ideal - sometimes, for example, we will get coverage in the API by running a different test - but this is actually a nice way to find tests that are put in the wrong folder, and we can figure that one out later. There are, however, a few small problems I see (but I think we can figure out how to solve them):

* **Some files are now split between DB tests and non-DB tests** - which means we will get the "full" coverage only after both kinds of tests have run. This is quite a common situation, to be honest. It is solvable - by producing coverage files, uploading them as artifacts, and then having a separate job which downloads all the coverage files, combines them, and only after that reports success/failure. Most of your code would still work - it could simply be a separate job run after everything else is done. We already do a very similar thing for "summarize warnings", and for the very same reason: all the tests produce warning files, and then a single job pulls all of them, deduplicates, and summarizes them. This also has a very nice side effect - individual test jobs will not be failing. It is scary to see your tests failing and have to dig into the output to find out that it is coverage, rather than an actual test failure. Having a separate job is way better, because that job can be named "Coverage Check" or similar, and when it fails you will immediately know that it is the coverage that failed. Plus we can make that single job non-fatal, i.e. a coverage failure will not mean that the whole build has failed (we do that for Quarantined tests) - at least initially, while we are figuring out all the quirks and problems, we can accept the job failing for a while until we find and fix all the imperfections we will see from people running actual PRs. Or, initially, for testing, we could run it without failing at all - just produce a summary, have a look, and see how it works for various PRs. Then gradually we could make it fail (but non-fatal), and eventually make a green check a "must" for merging a PR, once we are sure there are very few false negatives.

* **The information that coverage dropped, without an easy way to check it, is not very helpful for the PR author.** I think it would be good (and this is not a very complex thing) to have a way to show the coverage diff for the files that have a problem: before/after. I think that can be done nicely by generating the right URL to the remote Codecov - when we upload the coverage from the test, there is a way to see it and, I think, even see the diff. Seeing just "coverage dropped" without giving the user a clue how to check it is bad - seeing a URL will help. There is also the GitHub Action from Codecov which we could use. I think such an approach might be way better than what the Codecov GitHub integration provides by default - with their actions they post comments in the PR with a summary etc., but I find that far too spammy. Having a separate job with "code coverage" information, an indication of the problem (by status), and a list of problematic files with a URL to follow is a way better thing.

* **Full vs. expected coverage.** I think eventually, rather than requiring full coverage, we could turn this into a check for coverage drops, and that way we could easily increase the number of files we monitor. But that's another story.
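The combine-then-check flow from the first point could be sketched roughly like this - a plain-Python illustration of the idea, not the PR's actual code (the file names, line numbers, and function names are made up for the example; a real job would merge `.coverage` data files with coverage.py instead):

```python
# Sketch of the combine-then-check idea: each test job uploads a map of
# {file: covered line numbers}; a single final job merges all of them
# and only then decides pass/fail. All names below are hypothetical.

def combine_coverage(reports):
    """Union per-job coverage so DB-test and non-DB-test runs add up."""
    combined = {}
    for report in reports:
        for path, lines in report.items():
            combined.setdefault(path, set()).update(lines)
    return combined

def uncovered(combined, statements):
    """Return {file: missing lines} for monitored files short of 100%."""
    return {
        path: sorted(set(lines) - combined.get(path, set()))
        for path, lines in statements.items()
        if set(lines) - combined.get(path, set())
    }

# Neither job alone covers api.py fully, but their union does -
# which is exactly why checking per-job coverage would misfire.
db_job = {"airflow/api.py": {1, 2, 3}}
non_db_job = {"airflow/api.py": {3, 4, 5}, "airflow/utils.py": {1}}
statements = {"airflow/api.py": [1, 2, 3, 4, 5], "airflow/utils.py": [1, 2]}

combined = combine_coverage([db_job, non_db_job])
print(uncovered(combined, statements))  # -> {'airflow/utils.py': [2]}
```

The same merge-first structure is what the "summarize warnings" job already does, so the coverage job could follow that existing pattern.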
