On 3 July 2010 08:56, Bengt Richter <[email protected]> wrote:
> On 07/02/2010 11:35 AM Carl Friedrich Bolz wrote:
> A thought/question:
>
> Could/does JIT make use of information in an assert statement? E.g., could we
> write
>     assert set(type(x) for x in img) == set([float]) and len(img)==640*480
> in front of a loop operating on img and have JIT use the info as assumed true
> even when "if __debug__:" suites are optimized away?

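To make the question concrete, here is a minimal sketch of one way the
asserted property can stop holding once the loop is running. The names
Sneaky and shift are hypothetical and not from this thread; the only
assumption is that the loop adds some operand that is not itself a float,
so its __radd__ gets to run and can mutate img.

```python
class Sneaky:
    """Hypothetical operand whose __radd__ mutates the list being iterated."""

    def __init__(self, target):
        self.target = target
        self.fired = False

    def __radd__(self, other):
        # float.__add__ returns NotImplemented for a Sneaky operand,
        # so Python falls back to this reflected method.
        if not self.fired:
            self.fired = True
            self.target.append("not a float")  # breaks the asserted invariant
        return other


def shift(img, offset):
    # The invariant is true here, before the loop starts...
    assert set(type(x) for x in img) == set([float])
    # ...but nothing stops `x + offset` from changing img mid-loop.
    return [x + offset for x in img]


img = [1.0, 2.0, 3.0]
result = shift(img, Sneaky(img))
# result == [1.0, 2.0, 3.0, "not a float"]
```

The assert passes, yet the loop ends up handling a str. A JIT that treated
the assert as a guarantee for the whole loop would have to prove that no
such operand can reach it.
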
There are several reasons we can't make use of such information from the
JIT at the moment. It requires more information than we have, and it is
difficult to analyse quickly. If img is visible from outside the current
thread, for example, the ad-hoc memory model of the Python language means
we would have to order writes and reads to img from other threads with
the JIT's own accesses. Similarly, functions that we call may insert
objects that break this invariant. Determining when this may occur
requires analysing a lot of code - for example, if *one* type was not
int, it could implement a __radd__ method that broke the invariant. It's
typically faster to just execute the code than to find out. In the
presence of whole-program optimisation this sort of thing is possible,
and with the right analysis it may be possible within the JIT, but the
question remains whether it will be profitable. (This is an area I have
been exploring, but don't hold your breath for results.)

On 3 July 2010 10:38, Bengt Richter <[email protected]> wrote:
> On 07/02/2010 04:14 PM Amaury Forgeot d'Arc wrote:
>> If efficient python code needs this, I'd better write the loop in C
>> and explicitly choose the types.
>> The C code could be inlined in the python script, and compiled on demand.
>> At least you'll know what you get.
>>
> Well, even C accepts hints like 'register' (and may ignore you, so you are
> not truly sure what you get ;-)
>
> The point of using assert would be to let the user remain within the python
> language, while still passing
> useful hints to the compiler.

Interesting you mention Racket. Racket comes with a static language that
integrates with their usual dynamic Scheme. Many Common Lisp
implementations provide optional typing.

Paolo recently bemoaned the trend toward writing modules at interp level
for speed* - I'm not really sure if it is a trend now or not - but at
some point it might be fun looking at optional typing annotations that
compile a special case for those assumptions. It might be a precursor to
Cython or Pyrex support.

* with justification: though ok for the stdlib, translating PyPy every
time you add an extension module is going to get old. fast.

> Could such assertions allow e.g. a list to be implemented as a homogeneous
> vector
> of unboxed representations?

PyPy is already great in terms of data layout; for example, PyPy uses
shadow classes in the form of 'structures'. But supporting more
complicated layout optimisations (such as row or column order storage for
structures so the JIT can do relational algebra) would probably be
unique. It doesn't seem so far off considering that in the progression
(list int) -> (list unpacked tuple int) -> (list unpacked homogeneous
structure), the first step, limiting or otherwise determining the item
type, is the most complicated.

> If I wanted to mix languages (not uninteresting!), I'd go with
> racket (the star formerly known as PLT-scheme) -- possible can of worms --

As for mixing languages, that is the pinnacle of awesome; but this is
probably not the list for it. Consider MLVMs such as JVM+JSR-292, Racket,
GNU Guile, and Parrot: it seems to me that once you settle on an
execution / object model and / or bytecode format, you've already decided
which languages (where the 's' seems superfluous) will get first-class
support. Don't get me wrong, I find each of these really exciting, but
good multi-platform integration is a much harder problem than writing a
few compilers with a common bytecode format; and even the common bytecode
format is probably not a good idea, because different languages need
(really) different primitives, as Pirate has brought out.

Other impedance mismatches, such as calling conventions (e.g., JavaScript
and Lua functions silently accepting an incorrect number of arguments),
reduction methods (applicative vs normal order vs call-by-name), mutable
strings, TCE, various type systems involving structural types,
Oliveira/Sulzmann classes, existential types, dependent types, value
types, single and multiple inheritance, and the completely insane
(Prolog) make implementing real multi-language platforms a mammoth task.
And even if you manage to get that working, how do you make exception
hierarchies work? Why can't I cast my Java ArrayList as a C# ArrayList?
etc.

Sure, you could probably hook up a few of the bundled VMs; Io or E would
make for a great twisted integration DSL. But actually convincing people
to lock themselves into an unstandardised, unproven chimera? Let's just
say that doing multi-language right is NP-hard. Doing it while targeting
JVM and CLI, offering platform integration while supporting exotic
language constructs like real continuations? Likely impossible.

It's a nice idea, but probably out of PyPy's scope.

--
William Leslie
_______________________________________________
[email protected]
http://codespeak.net/mailman/listinfo/pypy-dev
