On Thu, Jul 8, 2010 at 3:11 PM, Allison Randal <[email protected]> wrote:
> add REG, REG, REG (integer/float with boxing/unboxing for PMC)
> sub
> mul
> div

What types of register do we have? Just generic machine-size registers,
or are we still differentiating between different types of registers
(I, N, S, P)? If we still have multiple register types (which I
support), are these opcodes polymorphic, or do they just do the same
song-and-dance that our current PASM ops do with a short name (add) and
a long name (add_n_n_n)? Also, if we're talking about implementing a
hash in microcode, we need a modulus.

> set REG, REG (by value, set from constant value or another register,
> boxing/unboxing for int, num, str, pmc)

Again, is this a single SET operation, or a family of related ops (like
our current PASM set_i_i, set_i_p, set_p_i, ...)? I would suggest it
would be much cleaner if we don't allow all ops to take constants for
arguments. Instead, we should have an op (or family of ops with the same
short name) to load constants from the constants table into a register.
All other ops should deal only with registers or memory.

> goto LABEL
> if BOOL, LABEL
> iseq
> isgt
> islt
> and
> or

I'm all for minimalism, but is there any reason not to include logical
NOT and XOR? An "unless BOOL, LABEL" op would be good too, if we don't
have NOT. Almost all modern processor architectures have microcode
operators for <= and >= as well, so it makes sense to include those too.
Since Lorito's design up until this point has really focused on
compatibility with JIT, we don't lose anything by providing the same
kinds of ops that the machine provides.

We do want to try to keep the number of ops much smaller than what PASM
offers now (64ish should be fine), but this doesn't have to be an
academic exercise in absolute minimalism. It's more important that we
have a functional and usable opset with performance and practicality
concerns in mind. We don't want to be in a situation where we need to be
constantly using sequences of two or three Lorito ops to perform common
operations that the hardware can do in one cycle.
Optimizers can typically reduce such sequences through peephole
optimization and strength reduction, but optimizers add a runtime cost
that we don't always want to pay.

> new PMC, PMC (create an instance object from an existing class object or
> "struct" definition, which was defined using declarative syntax)
>
> read (fill a register from stdin, absolute minimum for testing)
> write (write a register to stdout, absolute minimum for testing)

I would suggest we replace these with memory load and store operations
for dereferencing pointers to RAM. IO can still be done through PMCs, or
we could easily set up a memory map to do file output for testing.

> setattr (set/get a class attribute or "struct" member of a PMC)
> getattr
> call (a vtable function on a PMC, passing a varargs argument signature)

This calls a VTABLE on a PMC. How do we call an ordinary C-style
function? I would suggest we have a "vcall REG, REG, ARGS" to call a
vtable on a PMC, and a "ccall REG, REG, ARGS" to call a cdecl function,
such as in an NCI library or in Parrot core. Having a builtin
"pcall REG, REG, REG" op to call a Parrot method would be nice too,
since it gives us a nice encapsulation boundary for PCC calls (and
really shows them as being fundamental control flow operations).

> load (bytecode file)

As a general nit, I would suggest "loadbc" in case we wanted to have a
"load" op for memory access, or a "load_const" to load a value from the
constants table into a register. The term "load" is just far too easy to
overload.

> end (halt the interpreter cleanly)
>
> ------
>
> As a side note, if we have dynamic vtables, there's not so much reason to
> make strings a separate type from PMCs.

+1

> Can we drop the 'PMC' name and just call them objects?

Well, I'm not necessarily in favor of the term "PMC", but they aren't
really "objects" in the way that the Object PMC is an object with a
particular object model.
Traditionally we've used the term "PMC" to refer to primitive types and
"object" to refer to PMCs of type Object. If we call all PMCs objects,
then we lose a little bit of clarity.

> One alternative to the variable number of arguments on 'call' is a series of
> 'pusharg' ops before it, but I'd rather preserve the abstraction.

I would probably prefer the use of pusharg (and poparg inside the called
function itself), since that's the way arguments are actually passed at
the machine level (and hence the form it will need to be put in by the
JIT and AOT compilers anyway). A certain amount of sugar in the Lorito
assembler would still preserve the abstraction if we absolutely needed
it. Let's just try to remember that Lorito is really intended to be
friendly to the machine, not necessarily friendly to the programmer. We
could use a higher-level "language" like PIR if we want
programmer-friendly syntax and all sorts of added abstractions.

> The invocation of sub/method objects will happen by building up a
> callcontext object, and calling the 'invoke' vtable function on a PMC,
> passing it the call context object as an argument, and receiving the result
> context object as the return.

This is fine, though I would argue for the benefits of a "fast path"
kind of invocation where we don't have an invocant and we don't have any
arguments. Needing to always create a callcontext object beforehand to
pass arguments, even if we don't have any arguments, really hurts our
chances for inlining and tracing JIT optimizations.

> There is another alternative in memory allocation at this level of the
> microcode, and that is to allocate raw blocks of memory of a requested size,
> and allow direct manipulation of that memory. On the whole, this is one of
> the most error-prone aspects of C (user error, that is), and makes it harder
> to abstract away multiple garbage collection models. But, we may absolutely
> need it to implement some of the lower-level features.

We can't write PMCs in Lorito if we can't access raw memory buffers. Of
course, we could have C API functions to allocate and manipulate
buffers, and call those functions for every operation. I definitely
don't think that's the best, but it is possible. The more we can write
in Lorito, the more we can expose to the JIT, which means increased
opportunities for optimization. The more capable Lorito is, the more we
can write in it. Obviously we would like to strike a nice balance here,
so we can't be too aggressively minimalist.

One option is to have the raw memory-access features be only available
in code pages marked "insecure", so some internals code and PMCs
(dynpmcs too) could use direct memory access but everything else cannot.
This helps to enforce a separation of responsibilities, and show that
PMCs are the required mechanism for playing with memory.

--Andrew Whitworth
_______________________________________________
http://lists.parrot.org/mailman/listinfo/parrot-dev
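
To make two of the suggestions above concrete (a dedicated load_const op
so that every other op takes only registers, and an "unless BOOL, LABEL"
branch), here is a minimal sketch of a toy register machine in Python.
It is purely illustrative and is not Lorito or Parrot code: the tuple
encoding, the run() helper, the register count, and the use of numeric
jump targets instead of labels are all assumptions made for the example.

```python
# Toy register machine, for illustration only. Op names (load_const, add,
# islt, unless, goto, end) follow the thread; everything else is hypothetical.

def run(program, constants, num_registers=5):
    """Execute a list of (opname, operand, ...) tuples; return the registers."""
    regs = [0] * num_registers
    pc = 0
    while True:
        op, *operands = program[pc]
        pc += 1
        if op == "load_const":
            # The only op that reads the constants table.
            dst, index = operands
            regs[dst] = constants[index]
        elif op == "add":
            # Register-only arithmetic: no constant operands allowed.
            dst, a, b = operands
            regs[dst] = regs[a] + regs[b]
        elif op == "islt":
            dst, a, b = operands
            regs[dst] = 1 if regs[a] < regs[b] else 0
        elif op == "unless":
            # Branch when the condition register is false (zero).
            cond, target = operands
            if not regs[cond]:
                pc = target
        elif op == "goto":
            (pc,) = operands
        elif op == "end":
            return regs
        else:
            raise ValueError(f"unknown op: {op}")


CONSTANTS = [0, 1, 5]

# Sum the integers 0..4. Registers: r0 = sum, r1 = i, r2 = limit,
# r3 = step, r4 = condition flag.
PROGRAM = [
    ("load_const", 0, 0),   # 0: sum = 0
    ("load_const", 1, 0),   # 1: i = 0
    ("load_const", 2, 2),   # 2: limit = 5
    ("load_const", 3, 1),   # 3: step = 1
    ("islt", 4, 1, 2),      # 4: flag = (i < limit)
    ("unless", 4, 9),       # 5: leave the loop when flag is false
    ("add", 0, 0, 1),       # 6: sum += i
    ("add", 1, 1, 3),       # 7: i += step
    ("goto", 4),            # 8: back to the loop test
    ("end",),               # 9: halt
]

result = run(PROGRAM, CONSTANTS)
print(result[0])  # 10
```

Without "unless" (or a NOT op), the loop exit at step 5 would need an
inverted comparison plus an "if", or an extra jump around the loop body,
which is the kind of multi-op sequence the message argues against.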
