GitHub user tOWERoFjOY created a discussion: SQL Check Operators

While there are ways of chaining the SQL table/column check operators with the 
ShortCircuitOperator, I believe there should be a setting within the SQL 
check operators (SQLColumnCheckOperator, SQLTableCheckOperator, etc.), or a new 
operator, that lets these tasks behave like the ShortCircuitOperator does with 
a Python callable. That is to say, when the task completes the SQL check, it 
succeeds, and downstream tasks are then either skipped or executed depending 
on whether the threshold/check conditions are met.

While a failed task also keeps downstream tasks from running, my team likes to 
limit failed tasks to cases that require a remedy. There are pipelines where we 
would like the downstream ETL tasks simply to be skipped if, say, a staging 
table populated by one of our API scripts has no rows to add. In that scenario, 
the SQL column check task is not a failed task that requires attention.

I haven't dug into the source code to see how the operators function, but I'd 
encourage discussion on refining these operators, creating a new one, or 
demonstrating a streamlined way to achieve the outcome outlined above using 
current tools. Currently, my team uses SQLAlchemy to query the database and 
build a true/false condition within a Python callable. It just seems the SQL 
check operators are very close to providing the functionality I'm after.
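
To make the workaround concrete, here is a minimal sketch of that kind of 
true/false callable. It is an illustration under assumptions, not our actual 
code: it uses the standard-library sqlite3 module as a stand-in for SQLAlchemy 
so it runs anywhere, and the function name `staging_has_rows` and the table 
name `staging_orders` are hypothetical.

```python
import sqlite3


def staging_has_rows(conn: sqlite3.Connection, table: str) -> bool:
    """Return True when the given staging table holds at least one row.

    Hypothetical helper: sqlite3 stands in for the SQLAlchemy query
    described above.
    """
    # Table names cannot be bound as query parameters, so accept only a
    # plain identifier before interpolating it into the SQL text.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    row = conn.execute(f"SELECT EXISTS (SELECT 1 FROM {table})").fetchone()
    return bool(row[0])


if __name__ == "__main__":
    # Small demonstration against an in-memory database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staging_orders (id INTEGER)")
    print(staging_has_rows(conn, "staging_orders"))  # table is empty here
    conn.execute("INSERT INTO staging_orders VALUES (1)")
    print(staging_has_rows(conn, "staging_orders"))  # table now has one row
```

In a DAG, a zero-argument wrapper around a function like this would be passed 
as the `python_callable` of a ShortCircuitOperator: a falsy return value skips 
the downstream tasks while the check task itself still succeeds.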

GitHub link: https://github.com/apache/airflow/discussions/56148
