On Wed, Oct 30, 2019 at 1:26 PM Chad Dombrova <chad...@gmail.com> wrote:
>
>> Do you believe that a future mypy plugin could replace pipeline type checks
>> in Beam, or are there limits to what it can do?
>
> mypy will get us quite far on its own once we completely annotate the beam
> code. That said, my PR does not include my efforts to turn PTransforms into
> Generics, which will be required to properly analyze pipelines, so there's
> still a lot more work to do. I've experimented with a mypy plugin to smooth
> over some of the rough spots in that workflow and I will just say that the
> mypy API has a very steep learning curve.
>
> Another thing to note: mypy is very explicit about function annotations. It
> does not do the "implicit" inference that Beam does, such as automatically
> detecting function return types. I think it should be possible to do a lot
> of that as a mypy plugin, and in fact, since it has little to do with Beam it
> could grow into its own project with outside contributors.

Yeah, I don't think, as is, it can replace what we do, but with plugins I think it could come closer. Certainly there is information that is only available at runtime (e.g. reading from a database or avro/parquet file could provide the schema, which can then be used for downstream checking), and that may limit our ability to do everything statically (even Beam Java is moving in this direction). Mypy clearly has an implementation of the "is compatible with" operator that I would love to borrow, but unfortunately it's not (easily?) exposed. That being said, we should leverage what we can for pipeline authoring, and it'll be a great development tool regardless.
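
To make the runtime side concrete, here is a minimal sketch of an "is compatible with" check applied to a schema that is only known at runtime. It uses nothing but the standard `typing` module. The names `is_compatible_with`, `mismatched_fields`, `runtime_schema` and `format_user` are hypothetical, made up for this illustration; this is neither mypy's implementation nor Beam's. It also simplifies heavily: type parameters are treated as covariant, and `X | Y` unions, `Tuple[int, ...]`, protocols and type variables are not handled.

```python
from typing import Any, List, Optional, Union, get_args, get_origin, get_type_hints


def is_compatible_with(sub, base) -> bool:
    """Hypothetical helper: may a value of type `sub` be passed where `base` is expected?"""
    if sub is Any or base is Any:
        return True
    # A Union on the producing side must fit in every branch.
    if get_origin(sub) is Union:
        return all(is_compatible_with(s, base) for s in get_args(sub))
    # A Union on the consuming side needs only one matching branch.
    if get_origin(base) is Union:
        return any(is_compatible_with(sub, b) for b in get_args(base))
    sub_origin = get_origin(sub) or sub
    base_origin = get_origin(base) or base
    if not (isinstance(sub_origin, type) and isinstance(base_origin, type)):
        return False  # anything this sketch does not understand is rejected
    if not issubclass(sub_origin, base_origin):
        return False
    sub_args, base_args = get_args(sub), get_args(base)
    if not base_args:
        return True  # an unparameterized base accepts any parameters
    if len(sub_args) != len(base_args):
        return False
    # Simplification: parameters are compared covariantly.
    return all(is_compatible_with(s, b) for s, b in zip(sub_args, base_args))


def mismatched_fields(schema, fn):
    """Names of fn's annotated parameters whose schema type does not fit."""
    hints = get_type_hints(fn)
    hints.pop("return", None)
    return sorted(
        name for name, expected in hints.items()
        if not is_compatible_with(schema[name], expected)
    )


# Assumption: pretend this schema was discovered at runtime, e.g. from a file header.
runtime_schema = {"user_id": int, "tags": List[str], "score": Optional[float]}


def format_user(user_id: int, tags: List[str], score: float) -> str:
    return f"{user_id}: {','.join(tags)} ({score})"


print(mismatched_fields(runtime_schema, format_user))  # ['score']
```

The nullable `score` column is flagged because the downstream function expects a plain `float`. A static checker cannot report this unless the schema is also written down in the source, which is the limit described above.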