Hi all,

I talked earlier about a dev project I'm working on, and there's some progress:
https://github.com/gtoonstra/airflow-hovercraft

There are two types of tests:

1. Behavior tests: these test the behavior of operators against a
   stubbed-out "hook" and are driven through Python "behave" scripts.
   A behave script reads a bit easier and shows the data used in the test.

2. Hook tests: these pull in docker containers with popular databases and
   then test the hook methods against them. There's a provisioning system
   for these containers using yaml files, where you can tweak, reconfigure
   and load the containers with other data if required. This part is still
   very immature, but I hope it shows the potential. There are currently
   single-method tests for hive, s3, samba, ftps, mssql, mysql and postgres.

The docker containers are managed through "hovertools", which I split out
because it can be reused when testing DAGs end-to-end. For a DAG, you can
use the tools to reset the containers prior to triggering the DAG.

I found 3 issues in airflow hooks along the way:

- FTPSHook doesn't get initialized correctly (could be related to python3):
  the socket doesn't get wrapped and the connection fails, at least in my
  testing.
- The samba hook uses an SMB implementation that hasn't been updated since
  2012 and fails on a string/bytes encoding issue.
- The s3 hook couldn't connect to a custom port, which this testing needs.

Best regards,
Gerard
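
To illustrate the idea behind the behavior tests, here is a minimal sketch of testing an operator against a stubbed-out hook. It is not taken from the repository: it uses plain unittest.mock instead of behave scripts, and CopyTableOperator, its constructor arguments, and the hook methods get_records and insert_rows as used here are hypothetical stand-ins chosen for the example.

```python
from unittest import mock


class CopyTableOperator:
    """Hypothetical operator: reads rows via one hook, writes them via another."""

    def __init__(self, source_hook, target_hook, table):
        self.source_hook = source_hook
        self.target_hook = target_hook
        self.table = table

    def execute(self):
        # All I/O goes through the hooks, so stubs can replace them in tests.
        rows = self.source_hook.get_records("SELECT * FROM {}".format(self.table))
        self.target_hook.insert_rows(self.table, rows)
        return len(rows)


def run_behavior_check():
    # Stub the source hook: no database is contacted, the data is canned.
    source = mock.Mock()
    source.get_records.return_value = [(1, "a"), (2, "b")]
    target = mock.Mock()

    operator = CopyTableOperator(source, target, "users")
    copied = operator.execute()

    # Verify the operator's behavior by inspecting how it drove the stubs.
    source.get_records.assert_called_once_with("SELECT * FROM users")
    target.insert_rows.assert_called_once_with("users", [(1, "a"), (2, "b")])
    return copied


if __name__ == "__main__":
    print(run_behavior_check())
```

Because the operator only talks to the hooks it is given, the test can check what the operator does without any container running; testing the hooks themselves against real services is left to the hook tests.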
