u99127 commented on pull request #8361: URL: https://github.com/apache/tvm/pull/8361#issuecomment-872585551
> FYI there is also https://github.com/apache/tvm/blob/main/tests/python/topi/python/test_topi_conv2d_nchw.py for testing `NCHW` layout that already runs on all available targets. So if we run it where Arm CPU is enabled it will test it automatically.
>
> The only problem could be that it is a bit long-running as it has more than 350 test cases but we could think of creating a fast running mode too.

Got it. Yes, I could reproduce the failure with that test, but as you say I had to add 'llvm -device=arm_cpu' to the devices list to see the AssertionError failure. So maybe the solution here is that we add 'llvm -device=arm_cpu' to the test, or actually fix the test to only use 'llvm -device=arm_cpu' when running on ci_arm, rather than having it run twice on the GPU CI, which is where TOPI testing is turned on.

I suspect the following things need to happen to progress this with the tests.

1. We run the topi tests in Jenkins using the Task script and just turn that on for the CI on AArch64. That is a one-line change and hopefully it works. However, while pipe-cleaning that on my local machine, I've hit a few testisms in the topi tests that need to be cleaned up, especially some tests being suitable only for CUDA but not being marked as such. I can help with this. That's pretty easy and straightforward but will need buy-in from the project that running these tests in AArch64 CI is a good thing! I think this bug just shows the value of running these tests in CI. I would prefer to start with the entirety of the TOPI tests; it appears to take me about 30-40 minutes on a machine that has similar characteristics to the one in CI.

2. I agree with you that we should look to add to this PR an additional change to the test script you mention, adding 'llvm -device=arm_cpu' to the list of targets only if one is running the test on AArch64 hardware. Otherwise, when we come around to doing 1 above, we'd be running this test once for the llvm target and once for llvm -device=arm_cpu, just duplicating testing on a path where it is unneeded.

For bonus points we should audit the topi test directory and clean it up as we go along, because I note that many tests again run for both llvm and llvm -device=arm_cpu, which probably just uses more runtime during CI when it is not needed.

regards
Ramana
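The target selection proposed in point 2 could be sketched roughly as below. This is only an illustration under stated assumptions, not the change made in the PR: `select_targets`, `is_aarch64`, and `BASE_TARGETS` are hypothetical names, and host detection via `platform.machine()` is an assumed stand-in for whatever mechanism the test script or ci_arm would actually use.

```python
import platform

# Hypothetical baseline target list; the real test script defines its own.
BASE_TARGETS = ["llvm"]
ARM_CPU_TARGET = "llvm -device=arm_cpu"


def is_aarch64(machine=None):
    """Return True if the (given or detected) machine string names AArch64."""
    # Assumption: detect the host via platform.machine(); Linux reports
    # "aarch64" and macOS reports "arm64" on 64-bit Arm hardware.
    machine = platform.machine() if machine is None else machine
    return machine.lower() in ("aarch64", "arm64")


def select_targets(machine=None):
    """Build the target list, adding the Arm CPU target only on AArch64."""
    targets = list(BASE_TARGETS)
    if is_aarch64(machine):
        targets.append(ARM_CPU_TARGET)
    return targets


if __name__ == "__main__":
    # Shows which targets would be exercised on the current host.
    print(select_targets())
```

With something along these lines, a non-Arm CI node would not repeat the run for 'llvm -device=arm_cpu', while an AArch64 node would pick it up without any edits to the test.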
