Tony Lownds wrote:
>
> Paul Boddie wrote:
> >
> > What's the general opinion on systems which attempt to infer and
> > predict inappropriate type usage?

[...]

> > Couldn't such systems be a better aid to program reliability?
> > Would "optional" type declarations be relevant to the operation
> > of such systems?
>
> I've been hacking on such a system, called "t_types". It is in
> pre-release form right now. It actually deduces type usage using
> bytecode simulation, before run-time.

I've been working on something similar, which is partly why I asked the
question. See here:

http://www.python.org/pypi/analysis

However, it's not that similar: it traverses the AST (from the compiler
module) for want of a representation that doesn't drop useful information
from the program. For a more efficient approach, I can imagine a virtual
machine which operates on types and constraints as opposed to actual
data, and I imagine that the PyPy "flow object space" might do something
like this, but finding some kind of concise confirmation of such
suspicions in the PyPy documentation is often a challenge.

There are other systems which take similar approaches: Shedskin,
Starkiller, Pylint (not the Logilab one), along with various other
experiments. It makes me think that a common framework, possibly
involving some of the Logilab projects (as previously suggested to me by
one of their developers), may provide a good opportunity for
consolidation here.

> For t_types, starting the simulation with types more specific than
> "anything" is important for reasonable results. In general I think
> optional type declarations are relevant to such systems, whether a
> special syntax is adopted or decorators are used.

Is there some kind of unstated consensus on whole-program analysis?
Collin Winter expressed some reservations in a private communication, but
I'd be interested to hear why people seem to disregard that approach so
lightly.
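
As a rough illustration of the AST-traversal idea described above, here
is a minimal sketch of whole-program return-type inference. It is not
the code of the analysis package: it uses the modern standard-library
ast module (Python 3.8 or later) instead of the compiler module, and the
names infer_return_types and EXAMPLE are made up for this example. It
only understands literal arguments at call sites and returns of literals
or parameters, and it ignores control flow entirely.

```python
import ast


def infer_return_types(source):
    """Map each top-level function name to the set of type names it may return.

    Hypothetical helper for illustration only: flow-insensitive, and limited
    to positional literal arguments and very simple return expressions.
    """
    tree = ast.parse(source)
    functions = {node.name: node for node in tree.body
                 if isinstance(node, ast.FunctionDef)}

    # Seed parameter types from literal positional arguments at call sites.
    param_types = {name: {} for name in functions}
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in functions):
            params = functions[node.func.id].args.args
            for param, arg in zip(params, node.args):
                if isinstance(arg, ast.Constant):
                    seen = param_types[node.func.id].setdefault(param.arg, set())
                    seen.add(type(arg.value).__name__)

    # Collect the types reaching each return statement.
    results = {}
    for name, func in functions.items():
        returned = set()
        for node in ast.walk(func):
            if not isinstance(node, ast.Return):
                continue
            value = node.value
            if value is None:
                returned.add("NoneType")          # bare "return"
            elif isinstance(value, ast.Constant):
                returned.add(type(value.value).__name__)
            elif isinstance(value, ast.Name):
                # A returned name is only understood if it is a seeded parameter.
                returned |= param_types[name].get(value.id, {"unknown"})
            else:
                returned.add("unknown")
        results[name] = returned
    return results


EXAMPLE = '''
def identity(x):
    return x

def answer():
    return 42

identity(1)
identity("text")
'''

print(infer_return_types(EXAMPLE))
```

Note that the types of identity are known only because the example
program happens to call it; no declaration is involved. A function that
is never called with literal arguments would come out as "unknown" in
this sketch.
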
I don't doubt that accurate type inference is a hard problem - Mark
Dufour's thesis provides some level of confirmation of that - but I'm
inclined to believe that there's a wide open space between today's Python
and ubiquitous type declarations that could hold a more appropriate
solution.

[...]

> from t_types import t
>
> @t.signature(t.int|t.None, returns=t.int)
> def test43(foo=None):
>     if None is 1:
>         # should be dead code
>         return ''
>     if 1 is None:
>         # should be dead code
>         return ''
>     if foo is None:
>         return 1
>     else:
>         # foo should have type of t.int here
>         return foo
>
> @t.signature(returns=t.list[t.int])
> def test26():
>     x = []
>     x[0] = 1
>     return x

Stripping the import and decorators from this and running it through the
analysis tools doesn't identify dead code, mostly because I haven't
implemented optimisations for the identity operators in the way envisaged
above (although it is done for isinstance calls), but it does tell you
what the return types are, provided there are some calls to the above
functions. Whilst that is almost like declaring the types, it is only
necessary to provide usage of a function at the top of a "call
hierarchy" - you wouldn't write an explicit call to test26 if such
invocations already existed in other places.

Accurate prediction of the contents of things like lists can be hard,
though. Instead of declaring such things everywhere, one thing I
mentioned in my communications with Collin Winter was the possibility of
"semantic" marking of such types: instead of stating list[int], or to
take a more illustrative example, instead of stating list[Element|Text],
you refer to an ElementList (possibly by subclassing list) and then have
the inferencer work out that only Element and Text objects are ever
stored in such objects. Shedskin manages to find such things out all by
itself, but I'm not certain that it can always do so.
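
To make the "semantic" marking idea above a little more concrete, here
is a minimal sketch of how an inferencer might work out what an
ElementList holds. It is an assumption-laden toy, not how the analysis
package or Shedskin actually works: the helper name
infer_container_contents and the EXAMPLE program are invented, names are
matched without regard to scope, and any called name passed to append()
is assumed to be a class constructor.

```python
import ast


def infer_container_contents(source, container_name):
    """Return the names of classes whose instances are appended to
    variables bound to container_name(...).

    Hypothetical helper for illustration only: scope-insensitive, and it
    only recognises the pattern "var.append(SomeClass(...))".
    """
    tree = ast.parse(source)

    # Step 1: find variables assigned directly from container_name(...).
    containers = set()
    for node in ast.walk(tree):
        if (isinstance(node, ast.Assign)
                and isinstance(node.value, ast.Call)
                and isinstance(node.value.func, ast.Name)
                and node.value.func.id == container_name):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    containers.add(target.id)

    # Step 2: record what is appended to those variables.
    contents = set()
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "append"
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id in containers
                and len(node.args) == 1):
            arg = node.args[0]
            if isinstance(arg, ast.Call) and isinstance(arg.func, ast.Name):
                contents.add(arg.func.id)   # assumed to be a constructor call
            else:
                contents.add("unknown")
    return contents


EXAMPLE = '''
class Element: pass
class Text: pass
class ElementList(list): pass

children = ElementList()
children.append(Element())
children.append(Text())
'''

print(infer_container_contents(EXAMPLE, "ElementList"))
```

The program being analysed only names ElementList; the Element|Text
content type is derived from usage, so adding a third kind of child
would not require touching any declaration.
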
Having some level of abstract type annotation, whilst not as nice as
having the system work everything out by itself, is nevertheless
certainly better than the maintenance nightmare of going round adjusting
type declarations whenever a minor type-related modification is made to a
program.

Paul
_______________________________________________
Python-3000 mailing list
Python-3000@python.org
http://mail.python.org/mailman/listinfo/python-3000