potiuk commented on PR #27854: URL: https://github.com/apache/airflow/pull/27854#issuecomment-1325533144
> Given the difference bugs, I agree we should re-review #25717 but while the changes introduced by #25717 are, to me, a really good step forwards, it is quite complex to understand all the branches and consequences of this change. Thus, I would also strongly encourage running/writing tests with high coverage for all operators which inherits from `SQLExecuteOperator`. We might need some other folks in order to do that What kind of test do you imagine, I am curious @vincbeck ? I would love to hear more because general "tests" might mean many things. IMHO lack of unit test is not a problem - generally speaking the providers are sufficiently covered by unit tests. What I plan to add is to add some "preventive" pre-comits to flag things that we know might be problematic (after we experienced them) and prevent some future, similar errors where someone accidentally uses a method or class of ComonSql that should not be used. I also thought about some way how we can snapshot the API (methods and classes of common.sql) and flag any changes to it and make sure that someone deliberately acknowledges changes to the API and flag it for extra scrutiny. This is for example how Android OS API tests (internally) are done for common parts. Whenever API changes, the tests break and you need to deliberately update the "snapshot" by running some tooling - by doing that the one who does it confirms that they reviewed the change in the API and that consequences of doing so are acknowledged and reviewed. I think there is huge problem to do it with "classic" test approach. Those tests that test if the code "works" rather comparing to tests that flag "potentially hazardous changes in API". Why? When you develop common code like that you need to account for MANY version cross references (basically all the past clients you know but also all the past clients you do not know) and you would have to run ALL released past version of DB providers with the currently "in-progress" version of common sql provider. I think the only way to achieve any kind of "certainty" is AIP-47 style test (system tests) that installs latest common.sql provider. installs one-by-one all past version of all DB providers that can use the common.sql provider being released and run end-to-end tests with real Database behind - covering all the features. Those tests cannot be "Smoke Tests" AIP-47 style - there should fe full end-2-end comprehensive tests using released DB provider packages and "to-be-released" common sql package. This means that somoene has to develop and maintain comprehensive set of System tests (AIP-47) that will be dynamice enougj to find relevant tests for all past versions of all DB providers and run them with all past released versions of those providers. If we have 10 DB providers, each of them has 10 past releases, it means that every time common.sql provider is changed, you need to run 10x10 = 100 times comprehensive AIP-47 suite of tests. This is IMHO extreamely huge effort - mainly because of stability and sustainability. you know how AIP-47 tests are notoriously difficult to keep runnin. Realistically speaking, such suite of tests will be in a constant failure state and it will continue to be broken and someone will have to continuously manage and fix all such tests and it will not happen, because no-one will feel any need/incentive to keep it running and it will be abandoned very quickly (even if we make HUGE effort to have the tests). (note that the current AIP-47 for Amazon for example are really "smoke" tests. Any suite to prevent similar problem for common.sql would have to test many more cases "per operator" and will be much more "instable". It would also take enormous time to run those tests and it would require to run them with a real DB - in some cases just setting up a DB for test is a constant struggle. My experience tells me that investing in such tests is a huge and more often than not lost effort - usually costs connected with it are many times more than the gain you have by having such tests. But maybe there are other experiences :) ? Or maybe you think about other kinds of tests that could prevent this kind of errors? I'd love to hear your thoughts. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
