kellyseattle opened a new issue, #11670:
URL: https://github.com/apache/tvm/issues/11670

   Is there a way to verify that a PR hasn't resulted in an increased number of 
skipped tests?  The "Tests" link in the Jenkins report shows skipped tests, but 
it looks like that currently includes all tests that are run on other shards, 
making it very difficult to tell which tests were actually skipped.  (For 
example, the [Tests tab for 
PR #11313](https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-11313/7/tests/),
 which introduced the bug Mehrdad found, lists 238k skipped tests, most 
of which were run on a different shard.)
   
   Brainstorming on possible steps:
   1. When running `pytest_collection_modifyitems`, if a test is assigned to 
another shard, remove it from `items` entirely instead of marking it as 
skipped. Downside: if a test isn't assigned to any shard, it wouldn't show up 
in any report as being skipped.
   2. When generating the "Tests" tab in the Jenkins report, only show a test 
as being skipped if it was skipped on all shards.
   3. If the number of skipped tests has changed relative to the Jenkins 
report for the previous commit on main, highlight it in the PR. This would 
require additional GitHub/Jenkins interactions; it's uncertain how difficult 
that would be.
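   Option 1 could be sketched as a `conftest.py` hook. This is a minimal 
illustration, not TVM's actual sharding logic: the `NUM_SHARDS`/`SHARD_INDEX` 
environment variable names and the hash-based assignment are assumptions.

```python
# Hypothetical sketch of option 1: drop tests that belong to other shards
# from the collected items instead of marking them skipped, so they never
# show up in this shard's report. Env var names are placeholders.
import hashlib
import os


def pytest_collection_modifyitems(config, items):
    num_shards = int(os.environ.get("NUM_SHARDS", "1"))
    shard_index = int(os.environ.get("SHARD_INDEX", "0"))
    if num_shards <= 1:
        return
    items[:] = [
        item
        for item in items
        # Hash the node ID so each test lands on exactly one shard.
        if int(hashlib.sha256(item.nodeid.encode("utf-8")).hexdigest(), 16)
        % num_shards == shard_index
    ]
```

   As the downside in option 1 notes, a test assigned to no shard would 
simply vanish from every report, so a separate accounting check would still 
be needed.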
   
   Comments
   - we're working on a bot to comment a link to the docs on a PR. I think 
maybe we should also add in the # of added/removed/skipped tests somewhere, or 
perhaps see if Jenkins can know this
   - jenkins has some notion of test stats already: it tracks passed tests, 
failed tests, skipped tests (and ‘fixed tests’: those that were broken but are 
now passing, and ‘regressed tests’, which is the opposite). we could query 
these in the bot for the PR and its base commit and post them on the PR
   - option 1 of Eric’s is the cleanest and easiest; I’ve never seen any 
problems with the sharding, but it could silently break in the future
   - we’ll probably have to do something like option 2 eventually (i.e. post 
process the junit xml before sending to Jenkins)
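   Post-processing the JUnit XML might look roughly like the sketch below; 
the skip-message marker is an assumption and would need to match whatever 
text the sharding logic actually emits:

```python
# Hypothetical sketch of option 2: remove "skipped on another shard"
# testcases from a JUnit XML report before handing it to Jenkins, and
# fix up the suite's summary counts. The marker string is an assumption.
import xml.etree.ElementTree as ET


def strip_shard_skips(xml_text, marker="running on a different shard"):
    root = ET.fromstring(xml_text)
    for suite in root.iter("testsuite"):
        for case in list(suite.findall("testcase")):
            skipped = case.find("skipped")
            if skipped is not None and marker in (skipped.get("message") or ""):
                suite.remove(case)
        # Keep the suite's summary attributes consistent with its contents.
        cases = suite.findall("testcase")
        suite.set("tests", str(len(cases)))
        suite.set("skipped", str(sum(1 for c in cases
                                     if c.find("skipped") is not None)))
    return ET.tostring(root, encoding="unicode")
```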
   - would looking at just the number of tests run (i.e. passed + failed 
tests) carry the same signal?
   - we could also run collection one time and produce a count or master list 
and then fail if each test isn't accounted for
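   That "master list" check could be as simple as diffing node-ID sets. The 
function and its inputs below are illustrative: the master list would come 
from an unsharded `pytest --collect-only -q`, the per-shard sets from each 
shard's report:

```python
# Hypothetical sketch of the "collect once, then reconcile" idea: fail CI
# if the union of tests reported by all shards doesn't cover the full
# collected list. Inputs here are illustrative placeholders.
def find_unaccounted(master_list, per_shard_reports):
    accounted = set()
    for report in per_shard_reports:
        accounted |= set(report)
    # Any test collected but reported by no shard is a problem.
    return sorted(set(master_list) - accounted)
```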
   - there's also https://pypi.org/project/pytest-shard/
   - looks like they go the ‘remove from items’ route: 
https://github.com/AdamGleave/pytest-shard/blob/master/pytest_shard/pytest_shard.py#L49-L53
   - added a graph of passed/failed tests to the [main ci 
dashboard](https://monitoring.tlcpack.ai/d/hB2FYR87k/build-status?orgId=1), the 
change is pretty clear
   - Thank you, and that definitely helps!  I think the passed+failed tests 
graph is the best visual.  The "skipped" tests count has the advantage of 
being near zero, and therefore being noticed if/when it jumps much higher.  
Though, I'm not sure what the number of tests skipped for other reasons (e.g. 
no GPU on CPU nodes) is, so maybe it would be much farther from zero than I 
had been imagining.
   - we can always download a junit XML from a recent build and check it 
manually
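   For that kind of manual check, tallying skip reasons from a downloaded 
JUnit XML is only a few lines; this assumes the usual JUnit layout with 
`message` attributes on `<skipped>` elements:

```python
# Quick manual check on a downloaded JUnit XML: count skips per reason,
# so shard skips can be told apart from legitimate ones (e.g. "no GPU").
import xml.etree.ElementTree as ET
from collections import Counter


def skip_reasons(path):
    counts = Counter()
    for _, elem in ET.iterparse(path):
        if elem.tag == "skipped":
            counts[elem.get("message") or "(no message)"] += 1
    return counts
```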
   - test coverage from pytest or lcov/gcov for C++ would also tell us about 
these kinds of things, but coverage comes with its own can of worms
   - we could also have a lint-style step (not sure if it could run in lint) 
that compares the set of skipped tests to an include list of regexes or a 
master "skippable test list"
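   That lint-style check could boil down to matching skipped node IDs against 
an allowlist; the pattern below is an illustrative placeholder, not an actual 
TVM skip list:

```python
# Hypothetical lint-style check: report any skipped test whose node ID
# doesn't match an allowlist of known-acceptable skip patterns.
import re


def unexpected_skips(skipped_nodeids, allowed_patterns):
    allowed = [re.compile(p) for p in allowed_patterns]
    return [
        nid for nid in skipped_nodeids
        if not any(rx.search(nid) for rx in allowed)
    ]
```

   Anything returned would fail the step, forcing a PR author to either 
unskip the test or extend the allowlist explicitly.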
   
   CC: @hpanda-naut 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]