Hi All,

Jeremiah and I have been looking into optimising the time that is spent on tests. The reason for this is that Travis runs are taking more and more time and we are being throttled by Travis. As part of that we enabled color coding of test outcomes and timing of tests. The results are kind of... surprising.
This is the top 20 of tests where we spend the most time. MySQL (remember: concurrent access enabled) - https://s3.amazonaws.com/archive.travis-ci.org/jobs/205277617/log.txt:

tests.BackfillJobTest.test_backfill_examples: 287.9209s
tests.BackfillJobTest.test_backfill_multi_dates: 53.5198s
tests.SchedulerJobTest.test_scheduler_start_date: 36.4935s
tests.CoreTest.test_scheduler_job: 35.5852s
tests.CliTests.test_backfill: 29.7484s
tests.SchedulerJobTest.test_scheduler_multiprocessing: 26.1573s
tests.DaskExecutorTest.test_backfill_integration: 24.5456s
tests.CoreTest.test_schedule_dag_no_end_date_up_to_today_only: 17.3278s
tests.SubDagOperatorTests.test_subdag_deadlock: 16.1957s
tests.SensorTimeoutTest.test_timeout: 15.1000s
tests.SchedulerJobTest.test_dagrun_deadlock_ignore_depends_on_past: 13.8812s
tests.BackfillJobTest.test_cli_backfill_depends_on_past: 12.9539s
tests.SchedulerJobTest.test_dagrun_deadlock_ignore_depends_on_past_advance_ex_date: 12.8779s
tests.SchedulerJobTest.test_dagrun_success: 12.8177s
tests.SchedulerJobTest.test_dagrun_root_fail: 10.3953s
tests.SchedulerJobTest.test_dag_with_system_exit: 10.1132s
tests.TransferTests.test_mysql_to_hive: 8.5939s
tests.SchedulerJobTest.test_retry_still_in_executor: 8.1739s
tests.SchedulerJobTest.test_dagrun_fail: 7.9855s
tests.ImpersonationTest.test_default_impersonation: 7.4993s

Yes, we spend a whopping 5 minutes (nearly) on executing all examples. Another interesting one is “tests.CoreTest.test_scheduler_job”. This test just checks whether certain directories are created as part of logging. This could have been covered by a real unit test covering only the function that creates the files - now it takes 35s.

We discussed several strategies for reducing the time, apart from rewriting some of the tests (that would be a herculean job!). The best option seems to be:

1. Run the scheduler tests apart from all other tests.
2. Run “operator” integration tests in their own unit.
3. Run UI tests separately.
4. Run API tests separately.

This creates the following build matrix (warning: ASCII art):

-----------------------------------------------
|          | Scheduler | Operators | UI | API |
-----------------------------------------------
| Python 2 |     x     |     x     | x  |  x  |
-----------------------------------------------
| Python 3 |     x     |     x     | x  |  x  |
-----------------------------------------------
| Kerberos |           |           | x  |  x  |
-----------------------------------------------
| Ldap     |           |           | x  |     |
-----------------------------------------------
| Hive     |           |     x     | x  |  x  |
-----------------------------------------------
| SSH      |           |     x     |    |     |
-----------------------------------------------
| Postgres |     x     |     x     | x  |  x  |
-----------------------------------------------
| MySQL    |     x     |     x     | x  |  x  |
-----------------------------------------------
| SQLite   |     x     |     x     | x  |  x  |
-----------------------------------------------

So from this build matrix one can deduce that Postgres and MySQL are generic services that will be present in every build. In addition, all builds will use Python 2 and Python 3. I propose using Python 3.4 and Python 3.5.

Furthermore, I would like us to label our tests correctly, e.g. unit test or integration test.
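
For what it's worth, the matrix can also be written down as data, which makes the "present in every build" observation easy to check. This is only a minimal sketch in Python: the names GROUPS, MATRIX, needed_everywhere and requirements_for are made up for illustration and are not part of Airflow or of our Travis configuration.

```python
# Hypothetical encoding of the build matrix from this mail.
# Keys are the rows of the matrix, values are the test groups (columns)
# that have an "x" in that row.
GROUPS = ("scheduler", "operators", "ui", "api")

ALL = set(GROUPS)

MATRIX = {
    "Python 2": set(ALL),
    "Python 3": set(ALL),
    "Kerberos": {"ui", "api"},
    "Ldap": {"ui"},
    "Hive": {"operators", "ui", "api"},
    "SSH": {"operators"},
    "Postgres": set(ALL),
    "MySQL": set(ALL),
    "SQLite": set(ALL),
}


def needed_everywhere(matrix, groups):
    """Return the rows that are marked for every test group."""
    wanted = set(groups)
    return sorted(name for name, used_by in matrix.items() if wanted <= used_by)


def requirements_for(matrix, group):
    """Return the rows that a single test group needs."""
    return sorted(name for name, used_by in matrix.items() if group in used_by)


if __name__ == "__main__":
    print("In every build:", ", ".join(needed_everywhere(MATRIX, GROUPS)))
    for group in GROUPS:
        print(group + ":", ", ".join(requirements_for(MATRIX, group)))
```

Read this way, the scheduler build only needs the databases and the Python versions, while Kerberos, Ldap, Hive and SSH are only set up for the groups that actually use them.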