[GitHub] [airflow] potiuk commented on pull request #27854: Fix errors in Databricks SQL operator introduced when refactoring

GitBox Wed, 23 Nov 2022 11:02:25 -0800


potiuk commented on PR #27854:
URL: https://github.com/apache/airflow/pull/27854#issuecomment-1325533144


   > Given the difference bugs, I agree we should re-review #25717 but while 
the changes introduced by #25717 are, to me, a really good step forwards, it is 
quite complex to understand all the branches and consequences of this change. 
Thus, I would also strongly encourage running/writing tests with high coverage 
for all operators which inherits from `SQLExecuteOperator`. We might need some 
other folks in order to do that
   
   What kind of test do you imagine, I am curious @vincbeck ? I would love to 
hear more because general "tests" might mean many things.
   
   IMHO lack of unit test is not a problem - generally speaking the providers 
are sufficiently covered by unit tests. 
   
   What I plan to add is to add some "preventive" pre-comits to flag things 
that we know might be problematic (after we experienced them) and prevent some 
future, similar errors where someone accidentally uses a method or class of 
ComonSql that should not be used.
   
   I also thought about some way how we can snapshot the API (methods and 
classes of common.sql) and flag any changes to it and make sure that someone 
deliberately acknowledges changes to the API and flag it for extra scrutiny.  
This is for example how Android OS API tests (internally) are done for common 
parts. Whenever API changes, the tests break and you need to deliberately 
update the "snapshot" by running some tooling - by doing that the one who does 
it confirms that they reviewed the change in the API and that consequences of 
doing so are acknowledged and reviewed.
   
   I think there is huge problem to do it with "classic" test approach. Those 
tests that test if the code "works" rather comparing to tests that flag 
"potentially hazardous changes in API". Why?
   
   When you develop common code like that you need to account for MANY version 
cross references (basically all the past clients you know but also all the past 
clients you do not know) and you would have to run ALL released past version of 
DB providers with the currently "in-progress" version of common sql provider.
   
   I think the only way to achieve any kind of "certainty" is AIP-47 style test 
(system tests) that installs latest common.sql provider. installs one-by-one 
all past version of all DB providers that can use the common.sql provider being 
released and run end-to-end tests with real Database behind - covering all the 
features.  Those tests cannot be "Smoke Tests" AIP-47 style - there should fe 
full end-2-end comprehensive tests using released DB provider packages and 
"to-be-released" common sql package.
   
   This means that somoene has to develop and maintain comprehensive set of 
System tests (AIP-47) that will be dynamice enougj to find relevant tests for 
all past versions of all DB providers and run them with all past released 
versions of those providers. If we have 10 DB providers, each of them has 10 
past releases, it means that every time common.sql provider is changed, you 
need to run 10x10 = 100 times comprehensive AIP-47 suite of tests. 
   
   This is IMHO extreamely huge effort - mainly because of stability and 
sustainability. you know how AIP-47 tests are notoriously difficult to keep 
runnin. Realistically speaking, such suite of tests will be in a constant 
failure state and it will continue to be broken and someone will have to 
continuously manage and fix all such tests and it will not happen, because 
no-one will feel any need/incentive to keep it running and it will be abandoned 
very quickly (even if we make HUGE effort to have the tests). (note that the 
current AIP-47 for Amazon for example are really "smoke" tests. Any suite to 
prevent similar problem for common.sql would have to test many more cases "per 
operator" and will be much more "instable". It would also take enormous time to 
run those tests and it would require to run them with a real DB - in some cases 
just setting up a DB for test is a constant struggle. My experience tells me 
that investing in such tests is a huge and more often than not lost effort - 
 usually costs connected with it are many times more than the gain you have by 
having such tests.
   
   But maybe there are other experiences :) ? Or maybe you think about other 
kinds of tests that could prevent this kind of errors? I'd love to hear your 
thoughts.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [airflow] potiuk commented on pull request #27854: Fix errors in Databricks SQL operator introduced when refactoring

Reply via email to