Hey Gerard,

Thanks for sharing this! It looks like a really solid idea, and I'm looking forward to seeing where you take it. By chance, are there any docs or examples you've been able to write yet?

Best,
Marc

--
Marc Weil | Lead Engineer | Growth Automation, Marketing, and Engagement | New Relic

On Mon, Jun 19, 2017 at 11:13 PM, Gerard Toonstra <[email protected]> wrote:

> Hi all,
>
> I was talking about a dev project I was working on and there's some
> progress:
>
> https://github.com/gtoonstra/airflow-hovercraft
>
>
> There are two types of tests:
>
> 1. behavior tests: These test the behavior of operators against a stubbed
> out "hook", which is driven through python "behave" scripts. The behave
> script reads a bit easier, shows the data and is used to test the behavior
> of operators against stubbed out versions of hooks.
>
> 2. hook tests: these draw in docker containers with popular databases and
> then test the hook methods against them. There's a provisioning system for
> these containers using yaml files, where you can tweak, reconfigure and
> load the containers with other data if that is required. This part is still
> very immature, but I hope it shows the potential. There are currently
> single method tests for hive, s3, samba, ftps, mssql, mysql and postgres.
>
>
> The docker containers are managed through "hovertools", which I split out,
> because it can be reused when testing dags end-to-end. For a dag, you can
> use the tools to reset containers prior to triggering the DAG.
>
>
> I found 3 issues in airflow hooks along the way:
> - FTPSHook doesn't get initialized correctly (could be related to python3),
> the socket doesn't get wrapped and the connection fails. At least in my
> testing.
> - The samba hook uses an implementation of smb that's not been updated
> since 2012 and fails on a string/bytes encoding issue.
> - The s3 hook couldn't connect to a custom port to facilitate this testing.
>
>
> Best regards,
>
> Gerard
>
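
As a rough illustration of the first test type in the quoted message (operator behavior checked against a stubbed-out hook), here is a minimal sketch. It is not taken from airflow-hovercraft and does not use "behave"; it swaps in Python's standard `unittest.mock` to play the role of the stub. `CopyRowsOperator`, `run_with_stubbed_hooks`, and the `get_records` / `insert_rows` method names on the hook objects are hypothetical, chosen only for the example.

```python
from unittest.mock import MagicMock


class CopyRowsOperator:
    """Hypothetical operator: copies query results from one hook to another."""

    def __init__(self, source_hook, target_hook, sql, target_table):
        self.source_hook = source_hook
        self.target_hook = target_hook
        self.sql = sql
        self.target_table = target_table

    def execute(self):
        # Read from the source hook; write only when there is something to copy.
        rows = self.source_hook.get_records(self.sql)
        if rows:
            self.target_hook.insert_rows(self.target_table, rows)
        return len(rows)


def run_with_stubbed_hooks(rows):
    """Run the operator against stub hooks; return (row count, target stub)."""
    source = MagicMock()  # stands in for a real database hook
    source.get_records.return_value = rows
    target = MagicMock()  # records calls so the test can inspect them
    op = CopyRowsOperator(source, target, "SELECT * FROM src", "dst")
    return op.execute(), target


if __name__ == "__main__":
    count, target = run_with_stubbed_hooks([(1, "a"), (2, "b")])
    # The operator should have written the same rows it read.
    target.insert_rows.assert_called_once_with("dst", [(1, "a"), (2, "b")])
    print(count)
```

No database or container is needed here, which is the point of keeping the behavior tests apart from the docker-backed hook tests.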
