potiuk commented on PR #24058:
URL: https://github.com/apache/airflow/pull/24058#issuecomment-1144780452

   I love the way it is implemented @ferruzzi :). I think **maybe** over time 
we will generalize some of that to other providers, but I think @bhirsz 
explained very well about some of the problems we had when we tried to 
generalize too early. 
   
   When we initially implemented system tests (more than 3 years ago) we went 
the route of "implementing common, generalizable code" and implementing system 
tests usign it for google -> and then it turned out that keeping the 
generalization working for mutliple diferent tests is kinda difficult and 
indeed pollutes the solution a lot.
   
   Since then I got much bigger fan of "in order to make something reusable, it 
should be made usable first". So I'd say the right approach here is to "bubble 
up" the common utils - first let's see how reusable this approach can be for 
other AWS "properties" - not only Athena (we will likely learn that some of 
that is not as reusable as we thought) and then let's try to use similar 
approach in other providers and see a) whether it is needed there, b) whether 
the generalisation brings more problems than it solves. 
   
   I think there is no need to get an AIP or anything like that for those - 
working code and  PRs gradually making stuff re-usable is the best approach.
   
   There is one caveat though. I think (and there are lots of articles and 
blogs about it) there is a big difference between the "real application" code 
and test code. See for example here: 
https://codeshelter.wordpress.com/2011/04/07/dry-and-damp-principles-when-developing-and-unit-testing/
   
   1) For real application code "DRY" is more important than "DAMP"
   2) For test code "DAMP" is more important than "DRY" 
   
   It's never 0/1 of course - some small level of duplication (less DRY) is ok 
even for application code and some small level of "abstraction and 
reusabilitty" (less DAMP) is acceptable (pytest fixtures for example are good 
example of less DAMP more DRY approach). But generally speaking when I look at 
the test code, DAMP is far more important criteria than DRY.  I think there is 
a very thin lline between " the test can be read and understood on its own with 
the common code elsewhere" vs. "I have no idea what the test is doing because 
far too many things happen outside of it". This is the main reason for AIP-47 
at all - the problem with the generalized approach we had was that it was 
rather difficult to understand what the tests were doing because different 
parts of the tests were in different places. 
   
   There is this really nice property that test is standalone and does not 
require pytest to run - just setting one variable. We should keep it. 
Generation of the variable if not set is even cooler and if we can make it 
generic enough to be reused elsewhere, that would be fantastic. 
   
   However we have to remeber to not loose those properties of simplicity and 
readabilityh also for some simpler providers - we should see how the code in 
AWS can be generalized for those - and if we see a way and it does not make 
those tests  unnecessary complex and less readable - by all means, let's do it 
(but not too early).
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to